Source line driver for three dimensional non-volatile memory

ABSTRACT

A non-volatile storage system includes a plurality of non-volatile memory cells configured to form a monolithic three dimensional memory structure, a plurality of bit lines connected to the memory cells, a plurality of source lines connected to the memory cells, a plurality of bit line drivers connected to the bit lines and a plurality of source line drivers connected to the source lines and the bit lines. The source line drivers apply voltages to the source lines based on bit line voltages.

CLAIM OF PRIORITY

This application claims the benefit of U.S. Provisional Application 62/244,942, filed Oct. 22, 2015, incorporated herein by reference in its entirety.

BACKGROUND

Recently, ultra high density storage devices have been proposed using a three dimensional (3D) stacked memory structure sometimes referred to as a Bit Cost Scalable (BiCS) architecture. For example, a 3D NAND stacked memory device can be formed from an array of alternating conductive and dielectric layers. A memory hole is drilled in the layers to define many memory layers. A NAND string is then formed by filling the memory hole with appropriate materials. A straight NAND string (I-BiCS) extends in one memory hole, while a pipe- or U-shaped NAND string (P-BiCS) includes a pair of vertical columns of memory cells which extend in two memory holes and which are joined by a bottom back gate. Control gates of the memory cells are provided by the conductive layers. However, various challenges are presented in operating such memory devices.

BRIEF DESCRIPTION OF THE DRAWINGS

Like-numbered elements refer to common components in the different figures.

FIG. 1A is a perspective view of a portion of a 3D stacked non-volatile memory.

FIG. 1B is a functional block diagram of a memory system that includes the 3D stacked non-volatile memory of FIG. 1A.

FIG. 1C and FIG. 1D show the positioning of the memory array, source line drivers and bit line drivers.

FIG. 1E is a block diagram of a sense block.

FIG. 2 is a perspective view of a portion of a 3D stacked non-volatile memory.

FIG. 3 is a side view of a portion of a 3D stacked non-volatile memory.

FIG. 4 is a cross sectional and perspective view of a column of the 3D stacked non-volatile memory.

FIG. 5 is a block diagram of a 3D memory array.

FIG. 6 is a top view of one layer of the 3D stacked non-volatile memory.

FIG. 7 is a side view of a portion of a 3D stacked non-volatile memory.

FIG. 8 depicts a set of threshold voltage distributions representing data states.

FIGS. 9A-9E depict a programming process.

FIG. 10 is a flow chart describing one embodiment of a process for programming.

FIG. 11 is a table identifying various voltages applied to the individual source lines and bit lines.

FIG. 12 is a table identifying various voltages applied to word lines and select gate lines.

FIG. 13 depicts the voltage applied to the selected word line during a programming process.

FIG. 14 is a flow chart describing one embodiment of a process for verifying.

FIG. 15 is a flow chart describing one embodiment of a process for reading.

FIG. 16 is a table of voltages used during one embodiment of programming and verification of programming.

FIG. 17A depicts threshold voltages and programming a first pass of a multi-pass programming process that programs high states first.

FIG. 17B depicts threshold voltages and programming a second pass of a multi-pass programming process that programs high states first.

FIG. 18 is a table of voltages used for a first pass of a multi-pass programming process that programs high states first.

FIG. 19 is a table of voltages used for a second pass of a multi-pass programming process that programs high states first.

FIG. 20 is a block diagram of one embodiment of a source line driver.

FIG. 21 is a flow chart describing one embodiment of a process for operating a memory system with a source line driver.

FIG. 22A is a flow chart describing one embodiment of a process for operating a source line driver during verify or read operations.

FIG. 22B is a flow chart describing one embodiment of a process for operating a source line driver during programming operations.

FIG. 23 is a schematic diagram of one embodiment of a source line driver.

FIG. 24 is a schematic diagram of one embodiment of a source line driver.

DETAILED DESCRIPTION

A non-volatile storage system is proposed that includes a plurality of non-volatile memory cells configured to form a monolithic three dimensional memory structure, a plurality of bit lines connected to the memory cells, a plurality of source lines connected to the memory cells, a plurality of bit line drivers connected to the bit lines and a plurality of source line drivers connected to the source lines and the bit lines. The source line drivers apply voltages to the source lines based on bit line voltages.

One embodiment of the three dimensional stacked non-volatile memory device comprises alternating dielectric layers and conductive layers in a stack, a plurality of bit lines below the stack, and a plurality of source lines above the stack. There is a separate source line associated with each bit line, rather than one source line for an entire block, plane or array. Each source line is connected to a different subset of NAND strings. Each bit line is connected to a different subset of NAND strings. Because the bit lines are below the stack, there is no need for signal lines to carry signals from the substrate surface to the top of the stack for the bit lines and no crowding of lines occurs as bit lines try to pass through source lines when they are both at minimum pitch. Since bit line driver circuits are bigger than source line driver circuits, one embodiment locates bit lines underneath the memory array so that the bit line drivers residing on the silicon surface under the memory array have direct access to the bit lines. In one embodiment, the source line drivers, being smaller in size, are placed on the side of the memory array. Since the source line drivers are smaller than the bit line drivers and also smaller than the traditional sense amp circuits, this arrangement shrinks the memory die size by saving the area which is traditionally reserved for sense amplifiers (the traditional bit line drivers).

The three dimensional stacked memory device comprises a plurality of memory cells arranged in blocks. Each block includes memory holes (or pillars) which extend vertically in the stack, and comprise a column of memory cells such as, for example, in a NAND string. The three dimensional stacked non-volatile memory device includes N layers. The memory holes are divided into four groups at each level of a block and each group has a separate set of source side and drain side select signals so that a subset of memory holes can be active at any given time. Because of the concurrency in the programming and verifying, the number of programming and verify pulses is reduced and the overall programming process is faster than other architectures. This is enabled because each memory channel/hole in a selected group has its own dedicated source line in addition to having its own dedicated bit line. With this architecture each memory channel can be driven to its own designated voltage at both its source line and its bit line. This provides full control of the channel potential. Each channel can have one of a number of different potentials applied to it based on what data state is to be programmed on the memory cell that is along that channel and belongs to the selected word line. A data state is a condition of the memory cell which correlates to storing a predefined pattern of data. The meaning of a data state can change based on the type of memory technology used in various embodiments. For example, in a multi-level memory cell different threshold voltage levels for the cell may correlate to a particular data pattern that represents data settings on two or more logical levels of data stored in the multi-level memory cell. In another example, the data state may comprise the level of resistance for a filament formed in the cell. In another example, the data state may comprise the magnetic orientation of a magnetic layer in a Spin-transfer torque random access memory cell (STT-RAM).

The proposed structure allows for multiple data states to be verified concurrently, as is explained below. Memory cells are concurrently programmed to different data states, with memory cells being programmed to lower data states having their programming slowed by applying appropriate source line voltages and bit line voltages. In one embodiment, reading is performed sequentially for the data states.

FIG. 1A is a perspective view of a portion of a 3D stacked non-volatile memory device. The memory device 100 includes a substrate 101. On the substrate are example blocks BLK0 and BLK1 of memory cells and a peripheral area 104 with circuitry for use by the blocks. The substrate 101 can also carry circuitry under the blocks, along with one or more metal layers lower than the bit line layer which are patterned in conductive paths to carry signals of the circuitry. The blocks are formed in an intermediate region 102 of the memory device. In an upper region 103 of the memory device, one or more upper metal layers are patterned in conductive paths to carry signals of the circuitry. Each block comprises a stacked area of memory cells, where alternating levels of the stack represent word lines. While two blocks are depicted in FIG. 1A as an example, additional blocks can be used, extending in the x- and/or y-directions.

In one possible approach, the length of the plane, in the x-direction, represents a direction in which word lines extend, and the width of the plane, in the y-direction, represents a direction in which bit lines extend. The z-direction represents a height of the memory device.

FIG. 1B is a functional block diagram of the 3D stacked non-volatile memory device 100 of FIG. 1A. The memory device 100 may include one or more memory die 108. The memory die 108 includes a memory array (or other memory structure) 126 of memory cells, control circuitry 110, and read/write circuits 128. The memory array 126 is addressable by word lines via a row decoder 124 and by bit lines via a column decoder 132. The read/write circuits 128 include multiple sense blocks 130 (sensing circuitry) and allow a page (or other unit) of memory cells to be read or programmed in parallel. In some embodiments, a controller 122 is included in the same memory device 100 (e.g., a removable storage card) as the one or more memory die 108. In other embodiments, controller 122 is separated from the memory die 108. Commands and data are transferred between the host and controller 122 via lines 120 and between the controller and the one or more memory die 108 via lines 118.

The control circuitry 110 cooperates with the read/write circuits 128 to perform memory operations on the memory array 126, and includes a state machine 112, an on-chip address decoder 114, and a power control module 116. The state machine 112 provides chip-level control of memory operations. The on-chip address decoder 114 provides an address interface between the address used by the host or a memory controller and the hardware address used by the decoders 124 and 132. The power control module 116 controls the power and voltages supplied to the word lines and bit lines during memory operations. It can include drivers for word lines, source side select lines (SGS) and drain side select lines (SGD) and source lines. The sense blocks 130 include bit line drivers and circuits for sensing. Control circuitry 110 is also in communication with source control circuits 127, which include source line driver circuit 1, source line driver circuit 2, . . . , source line driver circuit p. The source line driver circuits are used to drive different (or the same) voltages on the individual source lines. The present architecture provides individual control of one source line per active memory cell. Hundreds of thousands (for example about 300,000) of source line driver circuits are required in addition to the same number of bit line driver circuits.

In some implementations, some of the components can be combined. In various designs, one or more of the components (alone or in combination), other than memory array 126, can be thought of as one control circuit. For example, a control circuit may include any one of, or a combination of, control circuitry 110, state machine 112, decoders 114/124/132, power control module 116, sense blocks 130, source control circuits 127, read/write circuits 128, and controller 122, and so forth.

FIG. 1C and FIG. 1D show the positioning of the memory array 126, source line drivers (SL Driver) and bit line drivers (BL Driver). FIG. 1C shows an embodiment where bit line drivers (BL Driver) are below the memory array 126 and source line drivers (SL Driver) are to the side of memory array 126. FIG. 1C also shows an example source line SL above memory array 126 and an example bit line BL below memory array 126. In one embodiment, the SL Driver includes a unity gain buffer for matching BL voltage during programming and a low Vth single transistor amp (source follower) for subtracting ~0.5V from VBL to apply to SL at other times. FIG. 1D shows the embodiment where bit line drivers (BL Driver) and source line drivers (SL Driver) are below memory array 126. One of the metal layers below the memory layer will be consumed. In one embodiment, there would be 3 available metal layers, for example, for connecting the bit line drivers, but only two layers available for connecting the source line drivers. It also means that the layer below the bit line layer becomes a critical layer at minimum pitch (in one example implementation). FIGS. 1C/D show how bit lines and source lines can coexist without any difficulty encountered when one set tries to pass through the other set. No such difficulty exists because one set does not need to try to pass through the other set. FIGS. 1C/D illustrate that both sets can be comfortably connected to their drivers without having to cross each other's metal layers. Any 3D memory architecture that has vertical channels as well as channels fabricated above metal layers (e.g. poly silicon channels as opposed to crystalline silicon channel) can benefit from the attributes of this architecture. Note that crystalline channels require a crystalline seed layer of silicon from which the crystalline silicon channel can be grown by epitaxy.
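The source line driver behavior described for FIG. 1C can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function name, the boolean mode argument, and the treatment of the approximately 0.5V offset as exactly 0.5 V are hypothetical and are not taken from the described circuit.

```python
# Behavioral sketch of the source line (SL) driver described for FIG. 1C.
# Assumption: the "~0.5V" offset in the text is modeled as exactly 0.5 V.
SL_OFFSET_V = 0.5


def source_line_voltage(v_bl: float, programming: bool) -> float:
    """Return the voltage driven onto a source line for a given bit line voltage."""
    if programming:
        # Unity gain buffer: the source line matches the bit line voltage.
        return v_bl
    # Source follower: the source line sits about 0.5 V below the bit line.
    return v_bl - SL_OFFSET_V
```

In this model the source line tracks its paired bit line exactly during programming and sits slightly below it at other times, which is the relationship the paragraph above describes.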

FIG. 1E is a block diagram of an individual sense block 130 partitioned into a core portion, referred to as a sense module 480, and a common portion 490. In one embodiment, there will be a separate sense module 480 for each bit line and one common portion 490 for a set of multiple sense modules 480. In one example, a sense block will include one common portion 490 and eight sense modules 480. Each of the sense modules in a group will communicate with the associated common portion via a data bus 472.
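As a worked illustration of the grouping described above, the following sketch counts the sense blocks needed when each bit line has its own sense module and eight sense modules share one common portion. The function name is hypothetical, and the 300,000 bit line figure is the approximate example value used elsewhere in this description.

```python
import math

# From the example above: eight sense modules 480 per common portion 490.
SENSE_MODULES_PER_BLOCK = 8


def sense_blocks_needed(num_bit_lines: int) -> int:
    """One sense module per bit line, grouped eight to a sense block."""
    return math.ceil(num_bit_lines / SENSE_MODULES_PER_BLOCK)


# Example: about 300,000 bit lines would need 37,500 sense blocks.
blocks_for_example_array = sense_blocks_needed(300_000)
```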

Sense module 480 comprises sense circuitry 470 that determines whether a conduction current in a connected bit line is above or below a predetermined level. In some embodiments, sense module 480 includes a circuit commonly referred to as a sense amplifier. Sense module 480 also includes a bit line latch 482 that is used to set a voltage condition on the connected bit line. For example, a predetermined state latched in bit line latch 482 will result in the connected bit line being pulled to a state designating program inhibit (e.g., 2*Vdd or just under 2*Vdd).

Common portion 490 comprises a processor 492, a set of data latches 494 and an I/O Interface 496 coupled between the set of data latches 494 and data bus 420. Processor 492 performs computations. For example, one of its functions is to determine the data stored in the sensed memory cell and store the determined data in the set of data latches. The set of data latches 494 is used to store data bits determined by processor 492 during a read operation. It is also used to store data bits imported from the data bus 420 during a program operation. The imported data bits represent write data meant to be programmed into the memory. I/O interface 496 provides an interface between data latches 494 and the data bus 420.

During sensing (i.e. read or verify), the operation of the system is under the control of state machine 112 that controls the supply of different control gate voltages to the addressed cell. As it steps through one or more predefined control gate voltages (the read reference voltages or the verify reference voltages) corresponding to the various memory states supported by the memory, the sense module 480 may trip at one of these voltages and an output will be provided from sense module 480 to processor 492 via bus 472. At that point, processor 492 determines the resultant memory state by consideration of the tripping event(s) of the sense module and the information about the applied control gate voltage from the state machine via input lines 493. It then computes a binary encoding for the memory state and stores the resultant data bits into data latches 494. In another embodiment of the core portion, bit line latch 482 serves double duty, both as a latch for latching the output of the sense module 480 and also as a bit line latch as described above.
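The sensing sequence described above can be sketched as follows. This is a simplified model, not the described circuit: it assumes the reference voltages are applied in ascending order, that the sense module trips as soon as the control gate voltage exceeds the cell threshold voltage, and that the memory state is encoded as a plain binary number. The function names and any voltage values passed in are hypothetical.

```python
from typing import Sequence


def sense_state(cell_vth: float, read_refs: Sequence[float]) -> int:
    """Step through ascending reference voltages and return the memory state.

    The state is the index of the first reference voltage at which the
    cell conducts (the sense module "trips").
    """
    for state, v_cg in enumerate(read_refs):
        if v_cg > cell_vth:  # cell conducts, so the sense module trips
            return state
    # The sense module never tripped: the cell is in the highest state.
    return len(read_refs)


def encode_state(state: int, bits: int) -> str:
    """Assumed encoding: the state index written as a fixed-width binary string."""
    return format(state, f"0{bits}b")
```

With three hypothetical reference voltages the model distinguishes four states, and the encoded bits stand in for the data that processor 492 would store into data latches 494.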

During program or verify, the data to be programmed is stored in the set of data latches 494 from the data bus 420. The program operation, under the control of the state machine, comprises a series of programming voltage pulses (with increasing magnitudes) each of which is concurrently applied to the control gates of a set of addressed memory cells so that the group of memory cells are programmed at the same time. Each programming pulse is followed by a verify process to determine if the memory cell has been programmed to the desired state. Processor 492 monitors the verified memory state relative to the desired memory state. When the two are in agreement, processor 492 sets the bit line latch 482 so as to cause the bit line to be pulled to a highest available bit line voltage that is designated to provide program inhibit. This inhibits the memory cell coupled to the bit line from further programming even if it is subjected to programming pulses on its control gate. In other embodiments the processor initially loads the bit line latch 482 and the sense circuitry sets it to an inhibit value during the verify process.
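The program and verify loop described above can be sketched with a toy cell model. The assumptions here are loud ones: every programming pulse raises the threshold voltage of a non-inhibited cell by a fixed amount, and the starting pulse magnitude, step size and pulse limit are hypothetical values chosen only for illustration.

```python
def program_and_verify(vth, verify_levels, v_pgm_start=16.0, v_pgm_step=0.5,
                       vth_shift_per_pulse=0.5, max_pulses=30):
    """Apply increasing programming pulses, verifying after each pulse.

    Returns the final threshold voltages and the list of pulse magnitudes used.
    """
    vth = list(vth)
    inhibited = [False] * len(vth)
    pulses = []
    v_pgm = v_pgm_start
    while not all(inhibited) and len(pulses) < max_pulses:
        # One pulse is applied concurrently to all addressed cells.
        pulses.append(v_pgm)
        for i in range(len(vth)):
            if not inhibited[i]:
                vth[i] += vth_shift_per_pulse  # toy model of charge trapping
        # Verify: a cell that reached its target is inhibited from now on.
        for i in range(len(vth)):
            if vth[i] >= verify_levels[i]:
                inhibited[i] = True
        v_pgm += v_pgm_step  # the next pulse has a larger magnitude
    return vth, pulses
```

Setting the inhibit flag corresponds to processor 492 setting bit line latch 482 so that the bit line is pulled to the program inhibit voltage; the cell then ignores later pulses.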

Data latch stack 494 contains a stack of data latches corresponding to the sense module. In one embodiment, there are three (or four or another number) of data latches per sense module 480. In some implementations (but not required), the data latches are implemented as a shift register so that the parallel data stored therein is converted to serial data for data bus 420, and vice versa. In one preferred embodiment, all the data latches corresponding to the read/write block of memory cells can be linked together to form a block shift register so that a block of data can be input or output by serial transfer. In particular, the bank of read/write modules is adapted so that each of its set of data latches will shift data into or out of the data bus in sequence as if they are part of a shift register for the entire read/write block.

During a memory operation (such as programming, verifying or reading), sense circuitry 470 is responsible for applying a bit line voltage to the respective bit line. As discussed below, during programming and verification, the bit line voltages are data dependent based on the target data state that the relevant memory cell connected to the bit line is being programmed to. Processor 492 reads the data being programmed from data latches 494 and configures sense circuitry 470 to drive the appropriate data dependent voltage on the bit line based on the data read from the data latches 494.

FIG. 2 is a perspective view of a portion of one embodiment of memory array 126 that is a three dimensional stacked non-volatile memory comprising alternating dielectric layers and conductive layers in a stack, a plurality of bit lines below the stack, and a plurality of source lines above the stack. For example, FIG. 2 shows conductive layers 202, 204, 206, 208, 210, 212, 214, 216, and 218, each of which operates as a word line and, therefore, can be referred to as a word line layer. To allow the drawing to fit on one page and be readable, not all of the conductive layers are depicted. For example, FIG. 2 does not show any of the conductive layers operating as source side select layers (SGSs) and drain side select layers (SGDs). One embodiment may include 60 conductive layers, with 48 conductive layers operating as word line layers, two layers above the 48 word line layers as dummy layers on the source side, four layers above dummy source layers operating as source side select layers (SGS), two layers below the 48 word line layers as dummy layers on the drain side, four layers below dummy drain layers operating as drain side select layers (SGDs). Other embodiments can implement different numbers of word line layers, dummy layers, source side select layers and drain side select layers.
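The 60-layer example above can be checked with a short sketch that lists the conductive layers from the top of the stack (source side) to the bottom (drain side). The role names and the helper function are hypothetical labels introduced only for this illustration.

```python
# Conductive layers of the 60-layer example, listed top (source side) to bottom.
LAYER_STACK = [
    ("SGS", 4),           # source side select layers
    ("dummy_source", 2),  # dummy layers on the source side
    ("word_line", 48),    # word line layers
    ("dummy_drain", 2),   # dummy layers on the drain side
    ("SGD", 4),           # drain side select layers
]


def expand_stack(stack):
    """Return one role label per conductive layer, in top-to-bottom order."""
    return [role for role, count in stack for _ in range(count)]


layers = expand_stack(LAYER_STACK)
total_layers = len(layers)  # 4 + 2 + 48 + 2 + 4 = 60
```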

Between the conductive layers are dielectric layers. Many different dielectric materials can be used. One example of a suitable dielectric material is SiO2. Note that FIG. 2 does not show the dielectric material between the conductive layers.

Below the stack of alternating dielectric layers and conductive layers are multiple separate and isolated bit lines 220, 222, 224, 226, 228, 230, 232, and 234. Although FIG. 2 only shows eight bit lines, the memory system is likely to have many more than eight bit lines (e.g. 300,000).

Above the stack of alternating dielectric layers and conductive layers are multiple separate and isolated source lines 240, 242, 244, 246, 248, 250, 252 and 254. Although FIG. 2 only shows eight source lines, the memory system is likely to have many more than eight source lines (e.g. 300,000). In one embodiment, bit line drivers (which include the sense amps) are located below the memory array (stack of layers) while the source line drivers are located to the side of the memory array. In another embodiment, both bit line drivers and source line drivers are located under the memory array. This provides further die size savings at the expense of consuming one of a number of available metal layers over the source line drivers and under the bit lines for connecting the source line drivers to the source lines. The number of available metal layers above the silicon surface and below the bit line layer in certain embodiments is either three or four. This does not include the contact and via layers. If we also count these contact and via layers, the number of metal layers below the memory array (including the bit line layer plus the via layers below and above it) adds up to nine layers in one embodiment or to eleven layers in another embodiment. Contact and via layers typically provide vertical connectivity in the Z direction, whereas the other metal layers provide both vertical and horizontal connectivity within the plane of the chip.

The stack of alternating dielectric layers and conductive layers includes memory holes or pillars which extend vertically in the stack, and comprise a column of memory cells such as in a NAND string. FIG. 2 shows columns/holes/pillars 260, 262, 264, 266, 268, 270, and 272. Although FIG. 2 only shows seven columns, the memory system is likely to have many more than seven columns. As depicted, each conductive layer will surround a set of columns, with one memory cell residing at the intersection of each column and each of the conductive layers designated to function as word lines.

Each bit line is connected to a subset of columns. For example, FIG. 2 shows bit line 230 connected to column 272, bit line 224 connected to column 270, bit line 220 connected to column 268 (note that column 268 is only partially depicted), and bit line 222 connected to column 262. Note that the terms “connected,” “coupled” and “in communication with” include direct connections and connections via other components. The bit lines connect to the columns through a combination of vias and plugs. For example, bit line 230 is connected to column 272 by via 284 and plug 274, bit line 224 is connected to column 270 by via 286 and plug 276, bit line 220 is connected to column 268 by via 288 and plug 278, and bit line 222 is connected to column 262 by via 290 and plug 280.

Each source line is connected to a subset of columns. In one embodiment, the source lines connect to the columns through vias and plugs. FIG. 2 shows plugs 291, 292, 293 and 294, as well as vias 295 and 296. Many of the vias for the source lines are hidden due to the perspective view. However, FIG. 2 does show column 270 connected to source line 244 by via 295 and plug 292.

The source lines are not connected together and can carry different signals. In one embodiment, each source line is associated with a bit line to create a source line/bit line pair. The system includes many source line/bit line pairs. Each bit line is associated with a different and separate source line. A source line is connected to the same column as its associated bit line of the source line/bit line pairs. For example, bit line 230 is associated with source line 252 and both are connected to column 272, bit line 224 is associated with source line 244 and both are connected to column 270, bit line 220 is associated with source line 240 and both are connected to column 268, and bit line 222 is associated with source line 242 and both are connected to column 262. In one embodiment, the bit lines are made of Tungsten, the source lines are made of Copper or Tungsten, the vias are made of Tungsten and the plugs are made of polysilicon. In one embodiment, the conductive word line layers are made of Tungsten. Tungsten may be preferable as it can withstand the process thermal budget associated with processing the layers above it, and the required dopant activation or polysilicon channel grain size expansion anneal steps that follow the deposition of the Tungsten.
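The source line/bit line pairing in the FIG. 2 example can be written down as a small lookup table. The sketch below uses only the reference numerals given above; the dictionary and function names are hypothetical.

```python
# Column -> (bit line, source line), using the reference numerals of FIG. 2.
COLUMN_CONNECTIONS = {
    272: (230, 252),
    270: (224, 244),
    268: (220, 240),
    262: (222, 242),
}


def source_line_for(bit_line: int) -> int:
    """Return the source line paired with the given bit line."""
    for bl, sl in COLUMN_CONNECTIONS.values():
        if bl == bit_line:
            return sl
    raise KeyError(f"no source line recorded for bit line {bit_line}")


# Each bit line in the example is paired with a different, separate source line.
bit_lines = [bl for bl, _ in COLUMN_CONNECTIONS.values()]
source_lines = [sl for _, sl in COLUMN_CONNECTIONS.values()]
pairs_are_unique = (len(set(bit_lines)) == len(bit_lines)
                    and len(set(source_lines)) == len(source_lines))
```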

FIG. 3 is a side view of the structure depicted in FIG. 2. Like FIG. 2, although FIG. 3 shows conductive layers 202, 204, 206, 208, 210, 212, 214, 216, and 218, FIG. 3 does not explicitly depict the dielectric layers between the conductive layers. Furthermore, FIG. 3 (like FIG. 2) only shows a subset of the conductive layers.

FIG. 4 is a perspective view of a cross section of a column from the memory array 126 (stack) described above. Each column includes a number of layers which are deposited along the sidewalls of the column. These layers formed on the sidewall of the memory holes can include, from the outer perimeter of the hole moving radially in toward the center, a charge trapping layer such as a specially formulated silicon nitride that increases trap density, followed by oxide-nitride-oxide (O-N-O) stack layer that acts as a band gap engineered tunnel dielectric, followed by polysilicon layer(s), followed by the inner most dielectric such as silicon oxide core fill. These layers are deposited using methods such as atomic layer deposition, chemical vapor deposition, or physical vapor deposition. There are many other intermediary steps such as anneals, densifications and sacrificial layers that are temporarily deposited and later removed. The inner most oxide of the ONO tunnel dielectric that is in contact with the polysilicon channel can be created by converting some thickness of the deposited nitride layer to oxide by methods such as ISSG (In-Situ Steam Generation). Other layers of the memory cell structure can be formed by depositions into the horizontal voids left behind after the sacrificial nitride layers are etched out, as opposed to deposition in the memory hole. Such layers can include the silicon oxide blocking layer and the aluminum oxide high K transition layer between the silicon oxide blocking layer and the word line. The word line deposition can start with a titanium nitride layer deposited on aluminum oxide followed by a tungsten seed layer deposition and then the remainder of the cavities for word line fingers can be filled with tungsten. Inside the cavities between word line layers, for example, a blocking oxide (SiO2) can be deposited. The Blocking Oxide surrounds the charge trapping layer. 
Surrounding the Blocking Oxide, and between the Blocking Oxide and the Word Line (TiN+Tungsten) is an Aluminum Oxide layer. In other embodiments, not shown in FIG. 2, the blocking oxide can be the first layer formed in the memory hole as opposed to the first layer formed in the cavities left behind after the sacrificial nitride layers are removed. The polysilicon channel is connected to a bit line at the bottom of the column and connected to the associated source line at the top of the column through intermediary deposited patterned layers including a metal via and a doped polysilicon plug, as discussed above. The polysilicon plugs can be n-type, preferably doped with some combination of Arsenic or phosphorus, or they can be p-type preferably doped with some combination of Boron or indium. In some embodiments Arsenic and indium are preferable because they diffuse more slowly during high temperature anneals which are required for poly crystalline grain size changes and other purposes.

When a memory cell is programmed, electrons are stored in a portion of the charge trapping layer which is associated with the memory cell. These electrons are drawn into the charge trapping layer from the polysilicon channel, and through the ONO tunnel dielectric. The threshold voltage (Vth) of a memory cell is increased in proportion to the amount of stored charge.

Each of the memory holes is thus filled with a plurality of annular layers comprising sometimes a blocking layer, usually a charge trapping layer, usually a tunnel dielectric multi-layer structure, and a channel layer. A core region of each of the memory holes is filled with a body material, and the plurality of annular layers are between the core region and the WLs that surround each of the memory holes.

Looking back at FIG. 2, memory system 100 includes a memory array 126 having the structure depicted in FIGS. 2, 3, and 4. FIG. 5 is a block diagram explaining the organization of memory array 126, which is divided into two planes 502 and 504. Each plane is then divided into N blocks. In one example, each plane has approximately 2000 blocks. However, different numbers of blocks and planes can also be used.

FIG. 6 is a block diagram depicting a portion of a top view of one layer of one block. The portion of the block depicted in FIG. 6 corresponds to box 450 in block 2 of FIG. 5. As can be seen from FIG. 5, the block depicted in FIG. 6 extends in the direction of arrow 632 and in the direction of arrow 630. In one embodiment, the memory array will have 48 memory layers; therefore, each block will have 48 layers. However, FIG. 6 only shows one layer. Each layer of a block has only one word line. For example, the layer of block 2 depicted in FIG. 6 includes word line 210 (see FIG. 2) surrounding a plurality of circles. Each circle represents a column (see FIG. 4). FIG. 6 has reference numbers for columns 270 (see FIG. 2), 272 (see FIG. 2), 650, 652, 654, 656, 658, 670, 672, 674, 676 and 678. Not all columns are provided with reference numbers in order to keep FIG. 6 readable. Some of the circles are shaded to indicate that those columns will not be used to store data, and are sacrificed to provide spacing.

FIG. 6 also shows dashed vertical lines. These are the bit lines. FIG. 6 shows sixteen bit lines: 220, 222, 224, 226, 228, 230, 232, 234, 604, 606, 608, 610, 612, 614 and 616. The lines are dashed to indicate that the bit lines are not part of this layer, rather they are below the stack. Each of the non-shaded circles has an “x” to indicate its connection to a bit line.

FIG. 6 does not show the source lines in order to keep the drawing readable. However, the source lines would be in the same position as the bit lines, but located above the stack rather than below. The source lines would connect to the columns in the same manner as the bit lines. Therefore, a source line and its associated bit line of a source line/bit line pair connect to the same column. In this manner, the structure of the source lines is symmetrical to the structure of the bit lines. Thus, for every active column, there is a dedicated bit line and a dedicated source line. If multiple columns are active at the same time, then each of the active columns has a unique dedicated bit line and a unique dedicated source line.

As can be seen from FIG. 6, each block has sixteen rows of active columns and each bit line connects to four columns in each block. For example, bit line 228 is connected to columns 652, 654, 670 and 674. Since all of these columns 652, 654, 670 and 674 are connected to the same word line 210, the system uses the source side select lines and the drain side select lines to choose one (or another subset) of the four to be subjected to a memory operation (program, verify, read, and/or erase).

FIG. 7 is a side cutaway view of a portion of the memory array, along bit line 228 and source line 254. Note that bit line 228 is the associated bit line for source line 254, thus forming a source line/bit line pair. FIG. 7 shows that while the word line layers extend across the entire block, the source side select lines and the drain side select lines are broken up into four sections. In one embodiment, each source side select line is implemented as four vertical layers connected together. Within each block, there are four source side select lines: SGS0, SGS1, SGS2 and SGS3. Similarly, each drain side select line is implemented as four vertical layers connected together. Within each block, there are four drain side select lines: SGD0, SGD1, SGD2 and SGD3. In one embodiment, SGS0 and SGD0 are used to control columns 674 and 676, SGS1 and SGD1 are used to control columns 672 and 670, SGS2 and SGD2 are used to control columns 654 and 656, and SGS3 and SGD3 are used to control columns 272 and 652.

FIG. 8 illustrates example threshold voltage distributions for the memory cell array when each memory cell stores three bits of data. Other embodiments, however, may use more or fewer than three bits of data per memory cell (e.g., two bits of data per memory cell, or four bits of data per memory cell). In the example of FIG. 8, there are eight valid threshold voltage distributions, also called data states (or target states): S0, S1, S2, S3, S4, S5, S6 and S7. In one embodiment, data state S0 is below 0 volts and data states S1-S7 are above 0 volts. In other embodiments, all eight data states are above 0 volts, or other arrangements can be implemented. In one embodiment, the threshold voltage distribution for S0 is wider than for S1-S7. In one embodiment, S0 is for erased memory cells. Data is programmed from S0 to S1-S7.

Each data state corresponds to a unique value for the three data bits stored in the memory cell. In one embodiment, S0=111, S1=110, S2=101, S3=100, S4=011, S5=010, S6=001 and S7=000. Other mapping of data to states S0-S7 can also be used. The specific relationship between the data programmed into the memory cell and the threshold voltage levels of the cell depends upon the data encoding scheme adopted for the cells. For example, U.S. Pat. No. 6,222,762 and U.S. Patent Application Publication No. 2004/0255090, “Tracking Cells For A Memory System,” filed on Jun. 13, 2003, describe various data encoding schemes for multi-state flash memory cells. In one embodiment, data values are assigned to the threshold voltage ranges using a Gray code assignment so that if the threshold voltage of a floating gate erroneously shifts to its neighboring threshold voltage distribution, only one bit will be affected. However, in other embodiments, Gray code is not used.
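As an illustration of the Gray code remark above, the following Python sketch counts how many data bits differ between neighboring data states. The first table is the example mapping listed above. The helper names and the particular inverted reflected-binary Gray assignment are hypothetical, chosen only for illustration, and are not an assignment taken from this description.

```python
# Example mapping listed in the text: index i holds the data for state Si.
EXAMPLE_MAPPING = ["111", "110", "101", "100", "011", "010", "001", "000"]


def inverted_gray_mapping(bits=3):
    """Hypothetical Gray assignment (an assumption for illustration).

    State i stores the bitwise complement of the reflected-binary Gray
    code of i, so that the erased state S0 still reads as all ones.
    """
    mask = (1 << bits) - 1
    return [format((i ^ (i >> 1)) ^ mask, "0{}b".format(bits))
            for i in range(1 << bits)]


def adjacent_bit_flips(mapping):
    """Number of differing bits between each pair of neighboring states."""
    return [sum(a != b for a, b in zip(x, y))
            for x, y in zip(mapping, mapping[1:])]


if __name__ == "__main__":
    print(adjacent_bit_flips(EXAMPLE_MAPPING))
    print(inverted_gray_mapping())
    print(adjacent_bit_flips(inverted_gray_mapping()))
```

With a Gray assignment every neighboring pair of states differs in exactly one bit, which is the property that limits an erroneous shift into a neighboring distribution to a single bit error.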

In one embodiment, all of the bits of data stored in a memory cell are stored in the same logical page. In other embodiments, each bit of data stored in a memory cell corresponds to different logical pages. Thus, a memory cell storing three bits of data would include data in a first page, data in a second page and data in a third page. In some embodiments, all of the memory cells connected to the same word line would store data in the same three pages of data. In some embodiments, the memory cells connected to a word line can be grouped into different sets of pages (e.g., by odd and even bit lines, or by other arrangements).

In some devices, the memory cells will be erased to state S0. From state S0, the memory cells can be programmed to any of states S1-S7. In one embodiment, known as full sequence programming, memory cells can be programmed from the erased state S0 directly to any of the programmed states S1-S7. For example, a population of memory cells to be programmed may first be erased so that all memory cells in the population are in erased state S0. While some memory cells are being programmed from state S0 to state S1, other memory cells are being programmed from state S0 to state S2, state S0 to state S3, state S0 to state S4, state S0 to state S5, state S0 to state S6, and state S0 to state S7. Full sequence programming is graphically depicted by the seven curved arrows of FIG. 8. In other embodiments, memory cells can be programmed using a coarse/fine methodology or other scheme.

FIG. 8 shows a set of verify target levels Vv1, Vv2, Vv3, Vv4, Vv5, Vv6, and Vv7. These verify levels are used as comparison levels (also known as target levels and/or compare levels) during the programming process. For example, when programming memory cells to state S1, the system will check to see if the threshold voltages of the memory cells have reached Vv1. If the threshold voltage of a memory cell has not reached Vv1, then programming will continue for that memory cell until its threshold voltage is greater than or equal to Vv1. If the threshold voltage of a memory cell has reached Vv1, then programming will stop for that memory cell. Verify target level Vv2 is used for memory cells being programmed to state S2. Verify target level Vv3 is used for memory cells being programmed to state S3. Verify target level Vv4 is used for memory cells being programmed to state S4. Verify target level Vv5 is used for memory cells being programmed to state S5. Verify target level Vv6 is used for memory cells being programmed to state S6. Verify target level Vv7 is used for memory cells being programmed to state S7.

FIG. 8 also shows a set of read compare levels Vr1, Vr2, Vr3, Vr4, Vr5, Vr6, and Vr7. These read compare levels are used as comparison levels during the read process. By testing whether the memory cells turn on or remain off in response to the read compare levels Vr1, Vr2, Vr3, Vr4, Vr5, Vr6, and Vr7 being separately applied to the control gates of the memory cells, the system can determine which data state each memory cell is in. In one embodiment, Vr1=0.2v, Vr2=1.0v, Vr3=1.8v, Vr4=2.6v, Vr5=3.4v, Vr6=4.2v and Vr7=5.0v. However, other values can also be used.

In general, during verify operations and read operations, the selected word line is connected to a voltage, a level of which is specified for each read operation (e.g. see read compare levels Vr1, Vr2, Vr3, Vr4, Vr5, Vr6, and Vr7, of FIG. 8) or verify operation (e.g. one voltage is used to verify all states, as discussed below) in order to determine whether a threshold voltage of the concerned memory cell has reached such level. After applying the word line voltage, the conduction current of the memory cell is measured to determine whether the memory cell turned on (conducted current) in response to the voltage applied to the word line. If the conduction current is measured to be greater than a certain value, then it is assumed that the memory cell turned on and the voltage applied to the word line is greater than the threshold voltage of the memory cell. If the conduction current is not measured to be greater than the certain value, then it is assumed that the memory cell did not turn on and the voltage applied to the word line is not greater than the threshold voltage of the memory cell. During a read or verify process, the unselected memory cells on selected columns (i.e. NAND chains) corresponding to a selected word line (i.e. finger) are provided with one or more read pass voltages at their control gates so that these memory cells will operate as pass gates (e.g., conducting current regardless of whether they are programmed or erased).

There are many ways to measure the conduction current of a memory cell during a read or verify operation. In one example, the conduction current of a memory cell is measured by the rate it discharges or charges a dedicated capacitor in the sense amplifier while maintaining a specified bit line voltage. In another example, the conduction current of the selected memory cell allows (or fails to allow) the NAND string that includes the memory cell to discharge a corresponding bit line. The voltage on the bit line is measured after a period of time to see whether it has been discharged or not. Note that the technology described herein can be used with different methods known in the art for verifying/reading. Other read and verify techniques known in the art can also be used.

In some embodiments, the program voltage applied to the control gate includes a series of pulses that are increased in magnitude with each successive pulse by a predetermined step size (e.g. 0.2v, 0.3v, 0.4v, 0.6v, or others). Between pulses, some memory systems will verify whether the individual memory cells have reached their respective target threshold voltage ranges.

FIGS. 9A-9E depict one example programming process that uses six Vpgm program pulses on the selected word line to achieve threshold voltage distributions as per FIG. 8. Initially, in one embodiment, all memory cells being programmed are erased to data state S0. After erasing, a first Vpgm program pulse is applied. In one embodiment, the first Vpgm program pulse is at 19v. All memory cells being programmed will receive that same Vpgm program pulse. However, data dependent voltages are individually applied to the different bit lines and the different source lines so that memory cells being programmed to higher data states (e.g., S7) will increase in threshold voltage more quickly and memory cells being programmed to lower data states (e.g., S1) will increase in threshold voltage more slowly. The voltages applied to the bit lines and source lines are based on the target data state. Therefore, all memory cells being programmed to S1 will be subjected to a first bit line voltage and a first source line voltage, all memory cells being programmed to S2 will be subjected to a second bit line voltage and a second source line voltage, all memory cells being programmed to S3 will be subjected to a third bit line voltage and a third source line voltage, all memory cells being programmed to S4 will be subjected to a fourth bit line voltage and a fourth source line voltage, all memory cells being programmed to S5 will be subjected to a fifth bit line voltage and a fifth source line voltage, all memory cells being programmed to S6 will be subjected to a sixth bit line voltage and a sixth source line voltage, and all memory cells being programmed to S7 will be subjected to a seventh bit line voltage and a seventh source line voltage.

FIG. 9A depicts the results of applying the first Vpgm program pulse. FIG. 9A shows the target data states in solid lines and shows the actual threshold voltage distributions in dashed lines 802, 804, 806, 808, 810, 812 and 814. Actual threshold voltage distribution 802 represents the threshold voltage distribution for memory cells being programmed to data state S1. Actual threshold voltage distribution 804 represents the threshold voltage distribution for memory cells being programmed to data state S2. Actual threshold voltage distribution 806 represents the threshold voltage distribution for memory cells being programmed to data state S3. Actual threshold voltage distribution 808 represents the threshold voltage distribution for memory cells being programmed to data state S4. Actual threshold voltage distribution 810 represents the threshold voltage distribution for memory cells being programmed to data state S5. Actual threshold voltage distribution 812 represents the threshold voltage distribution for memory cells being programmed to data state S6. Actual threshold voltage distribution 814 represents the threshold voltage distribution for memory cells being programmed to data state S7. Note that the height/magnitude of the actual threshold voltage distributions 802, 804, 806, 808, 810, 812 and 814 is somewhat exaggerated in FIGS. 9A-9E in order to make the drawings easier to read.

FIG. 9B depicts the results of applying the second Vpgm program pulse. As depicted, actual threshold voltage distributions 802, 804, 806, 808, 810, 812 and 814 have moved toward higher voltages.

FIG. 9C depicts the results of applying the third Vpgm program pulse. As depicted, actual threshold voltage distributions 802, 804, 806, 808, 810, 812 and 814 have moved toward higher voltages.

FIG. 9D depicts the results of applying the fourth Vpgm program pulse. As depicted, actual threshold voltage distributions 802, 804, 806, 808, 810, 812 and 814 have moved toward higher voltages.

FIG. 9E depicts the results of applying the fifth Vpgm program pulse. As depicted, actual threshold voltage distributions 802, 804, 806, 808, 810, 812 and 814 have moved toward higher voltages. After the sixth Vpgm program pulse, the actual threshold voltage distributions should be the same as (or close to) the threshold voltage distributions depicted in FIG. 8.

FIG. 10 is a flow chart describing one embodiment of a process for performing programming on memory cells connected to a common word line to one or more targets (e.g., data states or threshold voltage ranges). The process of FIG. 10 is one example of how to implement the behavior depicted in FIG. 9. The process of FIG. 10 can also be used to implement programming strategies different than that of FIG. 9.

Typically, the program voltage applied to the control gate during a program operation is applied as a series of program pulses. Between programming pulses the system will perform verification. In many implementations, the magnitude of the program pulses is increased with each successive pulse by a predetermined step size. In step 868 of FIG. 10, the programming voltage (Vpgm) is initialized to the starting magnitude (e.g., ˜19V or another suitable level) and a program counter PC maintained by state machine 112 is initialized at 1. In step 870, data dependent voltages are individually applied to the different bit lines and the different source lines. Data dependent voltages are voltages that vary based on the data pattern being programmed. More details of step 870 are discussed below with respect to FIG. 11. In step 872, a program pulse of the program signal Vpgm is applied to the selected word line (the word line selected for programming). In one embodiment, the group of memory cells being programmed concurrently are all connected to the same word line (the selected word line). In step 872, the program pulse is concurrently applied to all memory cells connected to the selected word line.

In step 874, it is determined whether the program counter PC is less than K. In one embodiment, K=6, which means that the programming process will apply six programming pulses. The number 6 is based on the assumption that the natural VT distribution is about 3V wide and that the average VT shift up per program pulse is 0.5V, so 3.0/0.5=6 pulses. If the step size is changed or the assumption about the width of the natural distribution is wrong, then more or fewer pulses are needed. The technology described herein can be used with K>6 and K<6. The natural distribution is the response (i.e. new VT distribution) of a group of cells to a single program pulse when the same program pulse (or the same sequence of program pulses) is applied to all of the cells. The group of cells can be composed of, for example, all cells to be programmed on a word line, all cells to be programmed to a particular state on a word line, all cells on a block, all cells on a chip, or all cells across many chips, depending on the context in which the term natural VT distribution is used. Generally, the larger the group of cells under consideration, the wider the natural distribution from end to end. FIG. 9A shows seven different natural distributions (one per program state) where each one is the outcome for cells to be programmed to a particular state, and these outcomes are different due to the fact that programming of the lower states is retarded by virtue of higher voltages applied to the cells' source lines and bit lines. The lower the state, the higher the retarding potential transferred to its channel by application of these higher voltages to bit lines and source lines.

If the program counter PC is less than K, then the process continues at step 876, during which all of the memory cells being programmed are concurrently verified for all target data states using a single read voltage pulse on the selected word line and data dependent voltages on individual bit lines and individual source lines. Memory cells that verify successfully will be locked out from further programming for the remainder of the programming process. In step 878, the Program Counter PC is incremented by 1 and the program voltage Vpgm is stepped up to the next magnitude. After step 878, the process loops back to step 870 and another program pulse is applied to the selected word line. In one embodiment, the six program pulses are at 19v, 19.6v, 20.2v, 20.8v, 21.4v and 22v.
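The control flow of steps 868-878 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an implementation of the state machine: the function name, the event tuples and the default arguments are hypothetical, and the 0.6v step is simply the spacing between the example pulse magnitudes listed above.

```python
def program_loop(k=6, vpgm_start=19.0, step=0.6):
    """Return the ordered list of events for one pass through FIG. 10.

    Events are hypothetical (name, value) tuples used only to show the
    order of operations; no real hardware is driven.
    """
    events = []
    pc = 1                                            # step 868: PC = 1
    while True:
        vpgm = round(vpgm_start + (pc - 1) * step, 1)  # current Vpgm
        events.append(("apply_bl_sl_voltages", pc))    # step 870
        events.append(("program_pulse", vpgm))         # step 872
        if pc < k:                                     # step 874
            events.append(("verify_all_states", pc))   # step 876
            pc += 1                                    # step 878
        else:
            break                # PC == K: done, no verify after last pulse
    return events


if __name__ == "__main__":
    for event in program_loop():
        print(event)
```

In this sketch the six pulses are each preceded by a bit line/source line setup, and only the first five are followed by a verify, matching the embodiment in which no verification is performed for the last program pulse.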

If, in step 874, it is determined that the program counter is not less than K (i.e. PC=K), then the programming process of FIG. 10 is complete. In this embodiment, there is no verification performed for the last program pulse. In other embodiments, verification can be performed for the last program pulse and the system can (optionally) determine whether enough memory cells have been successfully programmed.

Steps 870-878 implement a loop of applying programming and then verifying. This process is performed in an iterative fashion to avoid over programming.

FIG. 11 is a table that identifies one embodiment of data dependent source line voltages and bit line voltages for programming, verifying and reading. Step 870 of FIG. 10 includes applying data dependent voltages to individual source lines and bit lines for programming. The second column of FIG. 11 (header of “Program”) identifies the data dependent voltages applied to individual source lines and the seventh column of FIG. 11 (header of “Program”) identifies the data dependent voltages applied to individual bit lines. For example, if a memory cell is being programmed to state S1, then in step 870 the source line receives 4.8 volts and the bit line receives 4.8 volts. If a memory cell is being programmed to state S2, then in step 870 the source line receives 4.0 volts and the bit line receives 4.0 volts. If a memory cell is being programmed to state S3, then in step 870 the source line receives 3.2 volts and the bit line receives 3.2 volts. If a memory cell is being programmed to state S4, then in step 870 the source line receives 2.4 volts and the bit line receives 2.4 volts. If a memory cell is being programmed to state S5, then in step 870 the source line receives 1.6 volts and the bit line receives 1.6 volts. If a memory cell is being programmed to state S6, then in step 870 the source line receives 0.8 volts and the bit line receives 0.8 volts. If a memory cell is being programmed to state S7, then in step 870 the source line receives 0.0 volts and the bit line receives 0.0 volts. If the memory cell is to remain in the erased state S0, then in step 870 the source line receives 6.0 volts and the bit line receives 6.0 volts. Once a decision has been made based on one of the verify operations to lock out any particular cell from further programming (due to the cell's VT exceeding its verify level), then from that point on the cell/column will be treated the same way as an erased cell (i.e. it will be locked out of further programming by boosting or other methods that inhibit programming).
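The program-phase voltages recited above can be collected into a small lookup. The following Python sketch is only an illustration: the table values are the example numbers from the text, while the names `PROGRAM_V`, `INHIBIT_V` and `program_line_voltages` are hypothetical.

```python
# Example source line / bit line voltage (volts) applied in step 870,
# keyed by target data state. Values are the example numbers in the text.
PROGRAM_V = {1: 4.8, 2: 4.0, 3: 3.2, 4: 2.4, 5: 1.6, 6: 0.8, 7: 0.0}
INHIBIT_V = 6.0  # cells staying in S0, and cells already locked out


def program_line_voltages(target_state, locked_out=False):
    """Return (source_line_v, bit_line_v) for one column during programming."""
    if locked_out or target_state == 0:
        return (INHIBIT_V, INHIBIT_V)
    if target_state not in PROGRAM_V:
        raise ValueError("target_state must be in the range 0..7")
    v = PROGRAM_V[target_state]
    return (v, v)  # source line and bit line receive the same voltage


if __name__ == "__main__":
    for state in range(8):
        print("S%d" % state, program_line_voltages(state))
```

In these example values adjacent programmed states are spaced 0.8 volts apart, with the lowest programmed state S1 receiving the highest voltage and therefore the most retarded programming.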

Step 876 of FIG. 10 includes applying data dependent voltages to individual source lines and bit lines for verifying. The fourth column of FIG. 11 (second row has header of “Verify”) identifies the data dependent voltages applied to individual source lines and the ninth column of FIG. 11 (second row has header of “Verify”) identifies the data dependent voltages applied to individual bit lines. For example, if a memory cell is being programmed to state S1, then in step 876 the source line receives 4.8 volts and the bit line receives 5.3 volts. If a memory cell is being programmed to state S2, then in step 876 the source line receives 4.0 volts and the bit line receives 4.5 volts. If a memory cell is being programmed to state S3, then in step 876 the source line receives 3.2 volts and the bit line receives 3.7 volts. If a memory cell is being programmed to state S4, then in step 876 the source line receives 2.4 volts and the bit line receives 2.9 volts. If a memory cell is being programmed to state S5, then in step 876 the source line receives 1.6 volts and the bit line receives 2.1 volts. If a memory cell is being programmed to state S6, then in step 876 the source line receives 0.8 volts and the bit line receives 1.3 volts. If a memory cell is being programmed to state S7, then in step 876 the source line receives 0.0 volts and the bit line receives 0.5 volts. If the memory cell is to remain in the erased state S0, then in step 876 the source line receives 6.0 volts and the bit line receives 6.0 volts.

Step 870 of FIG. 10 also includes locking out memory cells that have been successfully verified to have reached their target data state. The fifth column of FIG. 11 (second row has header of “Lock out”) identifies the data dependent voltages applied to individual source lines and the tenth column of FIG. 11 (second row has header of “Lock out”) identifies the data dependent voltages applied to individual bit lines. In all cases, when a memory cell is locked out, the source line and bit line are set at 6 volts. Any memory cell that should be inhibited from programming has its source line and bit line set to 6.0 volts, as per the third and eighth columns of FIG. 11 (second row has header of “Inhibit”). Note that the numerical values listed in FIG. 11 are examples, and other values can also be used.

Because memory cells being programmed to lower states receive higher source line voltages and bit line voltages, the programming pulses will cause the memory cells being programmed to lower states to increase threshold voltage at a lower rate, as per the graphs of FIGS. 9A-E. Similarly, because memory cells being verified for lower states receive higher source line voltages and bit line voltages, the verification test can use the same single verification voltage pulse on the selected word line. FIG. 13 shows a sample voltage signal applied to a selected word line. There are six Vpgm program pulses 557, 558, 559, 560, 561 and 562 that increase in magnitude, as described above. One of the program pulses is applied during each iteration of step 872 of FIG. 10. Between the Vpgm program pulses are verify pulses 570. That is, between any two Vpgm program pulses is one verify pulse that is used to concurrently verify all data states by using different source line and bit line voltages as per the table of FIG. 11. Concurrently verifying all data states saves considerable time during a programming process. One verify pulse 570 is applied during each iteration of step 876.

FIG. 12 is a table that provides example voltages for the drain side select signal (VSGD), source side select signal (VSGS), selected word line (WL N), unselected word lines on the source side of the selected word line (WL#<N−1), and unselected word lines on the drain side of the selected word line (WL#>N+1). For example, during verify operations the selected word line receives one voltage pulse at 5.2 volts, while the unselected word lines on the drain side, the source side select signal for the selected NAND string, and the drain side select signal for the selected NAND string receive 6 volts, and while the unselected word lines on the source side receive 12 volts. Voltages other than 6V can be applied, and engineering optimization will determine the best voltages to apply to unselected word lines, various source side select gates, and various drain side select gates during both verify and program operations. During programming, the selected word line receives Vpgm (see FIG. 13), while the unselected word lines on the drain side, the source side select signal for the selected NAND string, and the drain side select signal for the selected NAND string receive 6 volts, and while the unselected word lines on the source side receive 12 volts. During reading, the selected word line receives Vcgr (i.e. Vr1, Vr2, Vr3, Vr4, Vr5, Vr6 or Vr7), the source side select signal for the selected NAND string receives 4 volts, the drain side select signal for the selected NAND string receives 4 volts, and all unselected word lines receive 7 volts. Note that the numerical values listed in FIG. 12 are examples, and other values can also be used.
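To see why one 5.2 volt verify pulse on the selected word line can serve every data state, it may help to look at the word line voltage relative to each cell's source line. The Python sketch below does that arithmetic with the example verify voltages given for FIG. 11. It rests on a simplifying assumption that is not stated in this description: the cell is treated to first order as turning on when the word line voltage minus the source line voltage exceeds its threshold voltage, ignoring body effect and other second-order behavior. The function names are hypothetical.

```python
VERIFY_WL_V = 5.2  # example single verify pulse on the selected word line

# Example verify voltages per target state (volts), from the text.
VERIFY_SOURCE_V = {1: 4.8, 2: 4.0, 3: 3.2, 4: 2.4, 5: 1.6, 6: 0.8, 7: 0.0}
VERIFY_BITLINE_V = {1: 5.3, 2: 4.5, 3: 3.7, 4: 2.9, 5: 2.1, 6: 1.3, 7: 0.5}


def effective_verify_level(state):
    """Word line voltage seen relative to the source line (first-order view)."""
    return round(VERIFY_WL_V - VERIFY_SOURCE_V[state], 1)


def bitline_to_source_bias(state):
    """Voltage difference between the bit line and the source line."""
    return round(VERIFY_BITLINE_V[state] - VERIFY_SOURCE_V[state], 1)


if __name__ == "__main__":
    for s in range(1, 8):
        print("S%d" % s, effective_verify_level(s), bitline_to_source_bias(s))
```

Under this assumption, each state sees a different effective compare level from the same word line pulse, rising in 0.8 volt steps from S1 to S7, while every selected column has the same 0.5 volt difference between its bit line and its source line.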

FIG. 14 is a flow chart describing one embodiment of a process for verifying that is performed as part of step 876 of FIG. 10. The process of FIG. 14 is one way to concurrently verify all data states, which is much faster than previous approaches which verify data states serially, one data state at a time. In step 902, the individual bit lines receive a data dependent signal, as discussed above, and the individual source lines receive a data dependent signal, as discussed above. In some embodiments, the system continues to apply the voltages (i.e. holds the voltages) on the bit lines and source lines at the end of a program pulse, rather than discharging them only to recharge them back up to the same (or similar) voltages for the following verify operation. This saves energy.

In step 904, the drain side selection signal is applied. In step 906, the source side selection signal is applied. Steps 904 and 906 can be performed concurrently or sequentially. If performed sequentially, either 904 or 906 can be performed first. In step 910, the set of sense amplifiers concurrently performs a sensing operation for all (or a subset) of the memory cells for all data states. That is, the system will sense for S0, S1, S2, S3, S4, S5, S6 and S7 at the same time. Note that in some embodiments, for verifying after the first program pulse, WLs, SGSs, SGDs, BLs, & SLs can all start to rise together in order to save time. They will reach final voltage values at different times.

In another embodiment, the system can start ramping up (raising the voltage of) the word lines, the select gates, the bit lines, and the source lines all together for the selected finger (i.e. word line). Because the bit lines and the source lines can be slower to rise, due to either their RC time constants being longer or their energy requirements being greater (which would necessitate an intentional controlled ramp up of these ˜600,000 lines in order not to exceed maximum allowed instantaneous currents), in some embodiments the word lines and select gates will reach high voltages before the bit lines and source lines reach high voltages. Note that one embodiment charges the bit lines and the source lines in two stages: stage 1 takes the lines to VCC or less, and stage 2 takes those lines that have to go higher than VCC from VCC to these higher values. Each stage is allotted a minimum of 20 microseconds based on worst case bit line or source line RC time constants. The maximum time for each stage is based on how many cells will require their bit lines and source lines to be raised in voltage during the BL/SL charging phase which occurs before each program pulse. Some program pulses will have very few BLs and SLs charging up to high voltages (e.g. charge ups for program pulses #2 & #6), for which the circuit is RC dominated and 20 µs per stage will be adequate. But there are other charging phases when the system needs to allow more than 20 µs for one or both stages of charge up before the associated program pulse. Thus, there is a pulse by pulse control of ramp up time and pulse dependent charge up times. There may be a lot of BL & SL charge up activity prior to program pulse #1.
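The two stage charging of a bit line or source line can be sketched as follows. This is a minimal Python illustration, assuming an approximate VCC of 3.1 volts (the value quoted in the description of FIG. 16); the function and constant names are hypothetical.

```python
VCC_V = 3.1              # approximate VCC, per the description of FIG. 16
MIN_STAGE_TIME_US = 20   # minimum time allotted to each charging stage


def two_stage_levels(target_v, vcc_v=VCC_V):
    """Return (level after stage 1, level after stage 2) for one line.

    Stage 1 takes the line to the lower of its target or VCC.
    Stage 2 takes lines whose target exceeds VCC from VCC to the target;
    lines already at their target simply stay there.
    """
    stage1 = min(target_v, vcc_v)
    stage2 = target_v
    return (stage1, stage2)


def minimum_charge_time_us():
    """Shortest total charging time when both stages use their minimum."""
    return 2 * MIN_STAGE_TIME_US


if __name__ == "__main__":
    for target in (0.0, 1.6, 4.8, 6.0):
        print(target, two_stage_levels(target))
```

For example, a line with a 4.8 volt target is taken to 3.1 volts in stage 1 and to 4.8 volts in stage 2, while a line with a 1.6 volt target reaches its final value in stage 1.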

FIG. 15 is a flow chart describing one embodiment of a process for reading. Unlike verification, reading is performed sequentially. That is, the system will perform a read operation for one data state at a time. In one embodiment, the system will first read to determine which memory cells are in S0, then S1, then S2, . . . S7. In other embodiments, other orders can be implemented. Each data state is associated with its own word line voltage, referred to as Vcgr (i.e. Vr1, Vr2, Vr3, Vr4, Vr5, Vr6 or Vr7). In step 950, the Vcgr voltage for the compare level (i.e. Vr1, Vr2, Vr3, Vr4, Vr5, Vr6 or Vr7) is applied to the selected word line. Additionally, the unselected word lines receive the voltages indicated in FIG. 12. In step 952, the drain side selection signal is applied. In step 954, the source side selection signal is applied. In step 956, the common bit line voltage is applied to all bit lines. In step 958, the common source line voltage is applied to all source lines. In step 960, the sense amplifiers will sense data for the Vcgr applied in step 950. If there are more compare levels to apply (step 962), then the process loops back to step 950. In one set of embodiments, there are seven compare levels, so there will be seven iterations of steps 950-960. When there are no more compare levels to evaluate (step 962), the process continues at step 964, in which the system determines which data state each memory cell read is in and what the corresponding stored data is. That data is reported to the host.
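The sequential read can be sketched in Python as a loop over the seven compare levels followed by a decode. This is an idealized model under stated assumptions, not the sensing circuitry: a cell is taken to turn on exactly when the applied Vcgr exceeds its threshold voltage, the compare levels are the example Vr values given with FIG. 8, the state-to-data table is the example mapping given earlier, and all function names are hypothetical.

```python
READ_LEVELS = [0.2, 1.0, 1.8, 2.6, 3.4, 4.2, 5.0]   # example Vr1..Vr7 (volts)
DATA_FOR_STATE = ["111", "110", "101", "100", "011", "010", "001", "000"]


def sense(cell_vt, vcgr):
    """Idealized step 960: True if the cell turns on at this Vcgr."""
    return vcgr > cell_vt


def read_cells(cell_vts):
    """Read every cell once per compare level, then decode (step 964)."""
    turned_on = [[] for _ in cell_vts]
    for vcgr in READ_LEVELS:                  # loop of steps 950-962
        for i, vt in enumerate(cell_vts):
            turned_on[i].append(sense(vt, vcgr))
    results = []
    for flags in turned_on:
        # The state index equals the number of compare levels at which the
        # cell stayed off, i.e. the position of the first level that turned
        # it on (or 7 if no level turned it on).
        state = flags.index(True) if True in flags else len(READ_LEVELS)
        results.append((state, DATA_FOR_STATE[state]))
    return results


if __name__ == "__main__":
    print(read_cells([-1.0, 0.6, 2.2, 5.5]))
```

A cell that turns on at every compare level is in S0, a cell that first turns on at Vr4 is in S3, and a cell that never turns on is in S7.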

Note that the processes of FIGS. 10 (programming), 14 (verifying) and 15 (reading) can be performed together in any combination, separately, concurrently, serially or in another manner.

In one embodiment, erasing is performed in the same manner as in the prior art. In another embodiment, erasing is performed by taking advantage of Gate Induced Drain Leakage (GIDL). In another embodiment, erasing is performed by the “gated diode effect,” which comprises electron-hole generation assisted by bias across a PN junction and a nearby gate's assisting in increasing the electric field needed to generate electron hole pairs.

In one embodiment, immediately after programming, a read operation is performed to make sure that the bit error rate is sufficiently low. Note that ECC can be used to fix a number of bit errors.

FIG. 16 is a table of voltages used during programming and verification of programming for the selected word line (WLn) in the selected block, and shows the transition of waveforms for the first two program pulses and the associated verify operations. The remaining operations are the same as the second program pulse (i.e. repeats of stages 2.1 to 2.7). The last program pulse (6th pulse in this example) does not require a verify in some embodiments and its stages 6.1 to 6.5 are similar to other program pulses' corresponding stages. An additional stage 6.6, during which all lines are brought back to ground, will bring the program verify sequence to an end. The voltages and timings serve as examples and can be different in various scenarios. Even the sequence of events can be changed to some extent. Other than the first column of labels, each column shows voltages during a different stage of operation. The first program pulse has seven stages: 1.1, 1.2, 1.3, 1.4, 1.5, 1.6 and 1.7. The second program pulse also has seven stages: 2.1, 2.2, 2.3, 2.4, 2.5, 2.6 and 2.7. Stages 1.1 and 1.2 as well as 2.1 and 2.2 are example implementations of step 870 of FIG. 10. Stages 1.4 and 1.5 as well as 2.4 and 2.5 are example implementations of step 872 of FIG. 10. Stages 1.7 and 2.7 are example implementations of step 876 of FIG. 10, as well as the process of FIG. 14. The table shows voltages for the four source side select lines (SGS0, SGS1, SGS2, SGS3), the four drain side select lines (SGD0, SGD1, SGD2, SGD3), the two drain side dummy word lines (WLDD1, WLDD2), the two source side dummy word lines (WLDS1, WLDS2), the selected word line WLN, unselected word lines (WL0, WL<N−1, WLN−1, WLN+1, WL>N+1, WL47), source lines and bit lines. With respect to the stage number, the digit to the left of the decimal point indicates the program pulse associated with the iteration of the programming process and the digit to the right of the decimal point refers to the sub stage (0.1-0.7).
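The stage numbering of FIG. 16 can be decoded mechanically. The following Python sketch splits a stage label into its program pulse and sub stage and looks up the FIG. 10 step that the text associates with that sub stage. The names are hypothetical, and applying the same sub stage to step association to pulses beyond the first two is an assumption based on the statement that the remaining operations repeat the second program pulse.

```python
# FIG. 10 step associated with each sub stage, per the text.
# Sub stages 3 and 6 are not tied to a FIG. 10 step in the text.
STEP_FOR_SUB_STAGE = {1: 870, 2: 870, 4: 872, 5: 872, 7: 876}


def parse_stage(label):
    """Split a label such as '2.7' into (pulse, sub_stage, fig10_step).

    fig10_step is None when the text does not associate the sub stage
    with a step of FIG. 10.
    """
    pulse_text, sub_text = label.split(".")
    pulse, sub_stage = int(pulse_text), int(sub_text)
    return (pulse, sub_stage, STEP_FOR_SUB_STAGE.get(sub_stage))


if __name__ == "__main__":
    for label in ("1.1", "1.3", "1.5", "2.7"):
        print(label, parse_stage(label))
```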

The first two sub stages for the first program pulse include setting the various bit line and source line voltages to their data dependent values. This is done in two stages, with the first stage (1.1) bringing the bit lines and source lines to the lower of their target or VCC (˜3.1v). The other signals are depicted to transition from 0 to the values noted. For example, SGS0 shows “0→8” which represents a transition from 0 volts to 8 volts. In the second stage (1.2), the bit lines and source lines are raised from VCC to their targets (if they were not already at their targets). In the third stage (1.3), the drain side select lines and source side select lines are lowered. The third stages (i.e. 1.3, 2.3, . . . , 6.3) can be eliminated in some embodiments for all program verify pulses. If they are to be eliminated, then select gate source and drain voltages are raised only to 6v as opposed to 8v in the first stages (i.e. 1.1, 2.1, . . . , 6.1). In the fourth stage (1.4), the word lines are raised to Vpass (e.g., 7-10 volts) to boost unselected NAND strings and prevent program disturb. In the fifth stage (1.5), the program pulse is applied. In the sixth stage (1.6), the system transitions to verify without bringing all of the signals down to 0 volts. In one embodiment, the system transitions to verify without bringing any of the listed signals down to 0 volts (or another resting or transition voltage). In the seventh stage (1.7), concurrent verification is performed. The stages for the second and subsequent program pulses are similar to the first program pulse, except that in the first stage (e.g., 2.1, 3.1, 4.1, 5.1, and 6.1), the transition of voltages is from the previous verify voltage levels rather than 0. For some of the sub stages, the bit line voltage shows “x or 6” which represents applying the data dependent value x or 6 volts because the memory cell is locked out.

Note that the chart of FIG. 16 shows the voltages for SGS0, SGS1, SGS2, and SGS3 as well as SGD0, SGD1, SGD2, and SGD3. The depicted voltages are for the instances when the particular select lines are selecting the NAND strings that include the selected memory cells. Typically, only one of SGS0, SGS1, SGS2, and SGS3 and only one of SGD0, SGD1, SGD2, and SGD3 will be turned on. In some embodiments SGS0, SGS1, SGS2, and SGS3 are tied to each other in any one block and can simply be referred to as SGS. In such case WL selection can be achieved by selectively turning on one of the 4 SGDs. In some embodiments SGD0, SGD1, SGD2, and SGD3 are tied to each other in any one block and can simply be referred to as SGD. In such case WL selection can be achieved by selectively turning on one of the 4 SGSs.

In one embodiment, the memory system does not necessarily have to have its bit lines below memory layers and its source line above. There can be embodiments with bit lines above the memory and source lines below the memory.

There is a description above of two stage charging for the bit lines and source lines. In other embodiments, three stage charging can be used for the bit lines and source lines. Three stage charging could become useful if VCC<6/2=(BL/SL voltage for inhibit)/2. Then stage 1 takes the lines to VCC or below, stage 2 takes the lines to slightly lower than 2*VCC, and stage 3 takes the lines to voltages up to slightly lower than 3*VCC.
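As a rough illustration of when a third charging stage becomes useful, the following sketch counts stages under an idealized assumption that each stage can add at most one VCC to the line. The text says each stage reaches only slightly below the corresponding multiple of VCC, so a real design needs some margin that this sketch ignores. The function name `charging_stages` is hypothetical.

```python
import math


def charging_stages(target_v, vcc):
    """Idealized number of charging stages needed to reach target_v.

    Assumption (not from the source): each stage adds at most one VCC,
    with no margin or losses. Real stages reach only slightly below each
    multiple of VCC.
    """
    if target_v <= 0:
        return 0
    return math.ceil(target_v / vcc)
```

With a 6 volt inhibit level, a VCC of about 3.1 volts fits in two stages under this idealization, while a hypothetical VCC of 2.5 volts (below 6/2) would need three.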

The above-described architecture reduces the number of program pulses and verify pulses, which results in an increase in performance of the memory system. As described, the time needed for verification is dramatically reduced as all states are verified simultaneously. Additionally, because the bit lines are below the stack, there is no need for bit line interconnects that run from below to above the stack, which saves space. Since there is only one word line per block per level, as opposed to multiple word lines on a level, the word line RC is reduced and less space is needed. Additionally, locating the bit line drivers (sense amplifiers) below the stack also saves room on the integrated circuit.

If, in some embodiments, programming all states concurrently or verifying all states concurrently proves too costly (e.g. too much leakage or disturb, or overly complex BL drivers), the system can instead deploy a scheme that would break each program pulse into two sets: one set geared for states A, B, C, and the other set geared for states D, E, F, and G, for example. For states A to C the Vpgm pulse will start at 16.2V, and when states D to G are to be programmed the first pulse for these states starts at 19V. Verify can also be broken up into two sets. This provides semi-concurrency. It will reduce the performance gain and may increase energy per bit programmed, but it may be the last resort for mitigating junction leakage and disturb problems due to the very high bit line and source line voltages of full concurrent program and verify, or for reducing the number of transistors in each of the ˜300,000 bit line drivers. Also, since it will reduce bit line and source line voltage requirements, it will be able to eliminate or significantly reduce the need to pump up the bit line and source line voltages that have to charge up and maintain voltages significantly higher than VCC.
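The two-set split can be sketched as a simple lookup. The set names `"low"` and `"high"` and the helper names are hypothetical labels introduced only for illustration; the state groupings and starting Vpgm values are the example values given above.

```python
# Example grouping from the text: states A-C in one set, D-G in the other.
# The keys "low" and "high" are illustrative names, not from the source.
SEMI_CONCURRENT_SETS = {
    "low": {"states": ("A", "B", "C"), "first_vpgm": 16.2},
    "high": {"states": ("D", "E", "F", "G"), "first_vpgm": 19.0},
}


def set_for_state(state):
    """Return the name of the program set that handles the given state."""
    for name, info in SEMI_CONCURRENT_SETS.items():
        if state in info["states"]:
            return name
    raise ValueError(f"unknown data state: {state!r}")


def first_vpgm_for_state(state):
    """Return the starting Vpgm (volts) of the set handling the state."""
    return SEMI_CONCURRENT_SETS[set_for_state(state)]["first_vpgm"]
```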

Another embodiment includes adding more pads to memory chips to bring in other voltage supplies in addition to VCC. For example, the system can bring as many as 6 other voltages from the outside onto the chip (not just 0v & VCC). An example would be to supply 0, 0.8, 1.6, 2.4, 3.2, 4.0, 4.8, & 6.0V from outside the chip. Another example is to bring 0, 0.9, 1.7, 2.5, 3.3, 4.1, 4.9, and 6.1V from outside and regulate them down to supply 0, 0.8, 1.6, 2.4, 3.2, 4.0, 4.8, & 6.0V. This will allow the memory chips to run a lot cooler by not having to use charge pumps to pump up about 600,000 bit lines and source lines to voltages that go as high as 6V. On solid state drives (e.g., SSDs and ESSDs) it may be easier to generate these voltages off chip. If these pads and the circuits that accompany them are on the chip, the system will have the option to use them or not depending on the type of product being offered.

Looking back at FIG. 8, full sequence programming is depicted by the arrows from data state S0 to data states S1-S7. Rather than implement full sequence programming, or in addition to implementing full sequence programming, some embodiments implement high states first programming (also known as HSF programming). In general, HSF programming is a multi-pass programming process. During an earlier pass of the multi-pass programming process, memory cells to be programmed to the higher data states are programmed, and during a later pass of the multi-pass programming process, memory cells to be programmed to the lower data states are programmed. In one embodiment, the higher data states include S4, S5, S6 and S7, while the lower data states include S1, S2 and S3. In other embodiments, the higher data states and the lower data states can include different groupings of data states. In one embodiment, the multi-pass programming process includes two passes; however, other embodiments can use more than two passes.

FIG. 17A depicts programming a first pass of a multi-pass programming process that programs high states first. During this first pass, memory cells to be programmed from data state S0 to the higher data states S4, S5, S6 and S7 are programmed. The first pass can be implemented using the process of FIG. 10.

FIG. 17B depicts programming a second pass of the multi-pass programming process that programs high states first. During this second pass, memory cells to be programmed from data state S0 to the lower data states S1, S2 and S3 are programmed. The second pass can be implemented using the process of FIG. 10.
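The assignment of target data states to passes in the two-pass example can be sketched as follows. The function name `hsf_pass` is hypothetical, and the grouping shown is the one embodiment described above (S4-S7 first, then S1-S3); other embodiments may group the states differently.

```python
# Grouping for the two-pass embodiment described in the text.
HIGH_STATES = {"S4", "S5", "S6", "S7"}
LOW_STATES = {"S1", "S2", "S3"}


def hsf_pass(target_state):
    """Return the pass (1 or 2) that programs the target state.

    Returns None for S0, since cells that remain in S0 are not
    programmed in either pass.
    """
    if target_state in HIGH_STATES:
        return 1
    if target_state in LOW_STATES:
        return 2
    if target_state == "S0":
        return None
    raise ValueError(f"unknown data state: {target_state!r}")
```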

FIG. 18 is a table of voltages used during programming and verify for the first pass of the multi-pass programming process that programs high states first. The voltages depicted in FIG. 18 are used to implement the programming of FIG. 17A using the process of FIG. 10.

FIG. 19 is a table of voltages used during programming and verify for the second pass of the multi-pass programming process that programs high states first. The voltages depicted in FIG. 19 are used to implement the programming of FIG. 17B using the process of FIG. 10.

FIGS. 18 and 19 depict the voltages for the source line (SL) of a memory cell selected for programming, the appropriate source side select line of the four possible source side select lines (see 4× SGS), the source side dummy word lines (2×WLDS), the selected word line connected to the memory cells selected for programming (WLN), the unselected word lines (WL0, WL1→N−1 and WLN+1→46, WL47), the drain side dummy word lines (2×WLDD), the appropriate drain side select line of the four possible drain side select lines (see 4× SGD), and the bit line (BL) of a memory cell selected for programming. As discussed above, a word line connects to all memory cells on a same layer of the memory array 126. The source side select lines and the drain side select lines are used to select a subset (e.g., ¼) of memory cells (and NAND strings). Thus, programming (including verification) is performed on the selected subset (also referred to as a page). The source side select line and the drain side select line associated with the selected subset of memory cells (the source side select line and the drain side select line used to select the selected subset of memory cells) receive the voltages depicted in FIGS. 18 and 19. The source side select lines and the drain side select lines associated with the unselected subsets of memory cells receive 0 volts.

As discussed above, the proposed architecture uses multiple source lines and these source lines can carry different voltages. Therefore, there is a need for multiple independent source line drivers. Furthermore, during program and verify operations, the source line voltage and the bit line voltage are both data dependent, meaning that the magnitudes of the voltages vary depending on the target data state for programming. As described above with respect to FIG. 1E, the bit line driver (sense block 130) includes latches to store the target data state for programming. Based on the data stored in the latches, a bit line driver generates the appropriate bit line voltage. One proposed source line driver could also include latches to store the target data state for programming. However, including the additional latches in the tens of thousands of source line drivers would be an inefficient use of space on memory die 108. Therefore, a smaller source line driver is proposed that does not include latches to store the target data state for programming. Instead, the proposed source line driver is connected to the source line and the bit line of a source line/bit line pair. The source line driver senses the bit line (which may be carrying a data dependent voltage) and generates a source line voltage based on the sensed bit line voltage. Thus, the bit line is providing the data dependent information to the source line driver, saving the source line driver from using real estate for latches.

FIG. 20 is a block diagram of one embodiment of a source line driver 1002, which can be used to implement source line driver 1, source line driver 2, . . . source line driver p of source control circuits 127 (see FIG. 1B). Thus, the memory die 108 would include p source line drivers 1002. In one embodiment, source line driver 1002 includes shorting circuit 1020, detection circuit 1022 and voltage adjustment circuit 1024. FIG. 20 also shows one or more control circuits 1004 connected to source line driver 1002. One or more control circuits 1004 can be any of the control circuits described above. Detection circuit 1022 is connected to shorting circuit 1020, voltage adjustment circuit 1024 and one or more control circuits 1004. In one embodiment, shorting circuit 1020, detection circuit 1022 and voltage adjustment circuit 1024 are all connected to the bit line BL, while shorting circuit 1020 and voltage adjustment circuit 1024 are connected to the source line SL.

Detection circuit 1022 receives one or more control signals and current/voltage supplies from one or more control circuits 1004, referred to in FIG. 20 by the signal lines marked “control.” The control signals can be used to configure source line driver 1002 to operate in a first mode for programming and a second mode for verifying/reading.

Detection circuit 1022 receives a benchmark voltage from one or more control circuits 1004. The term “benchmark voltage” is a name given to this reference signal. Detection circuit 1022 derives a reference from the benchmark voltage. Detection circuit 1022 compares the voltage on the bit line BL to the reference and provides the result of that comparison to shorting circuit 1020 and voltage adjustment circuit 1024. When verifying or reading, shorting circuit 1020 generates and applies a voltage to the source line SL that matches the voltage on the bit line BL if the voltage on the bit line is greater than or equal to the reference. When verifying or reading, voltage adjustment circuit 1024 generates and applies a voltage to the source line that is offset by a predetermined constant amount from the voltage on the bit line if the voltage on the bit line is less than the reference. In one embodiment, the predetermined constant amount is 0.5 volts; however, other amounts can also be used. In this manner, for example, during a verify process, the source line voltage is either the same as the bit line voltage or 0.5 volts less than the bit line voltage, which is in agreement with the tables of voltages discussed above. One or more control circuits 1004 can adjust the reference used by detection circuit 1022 by changing the applied benchmark voltage. When programming, one or more control circuits 1004 send the appropriate one or more control signals and supplies to configure shorting circuit 1020 to apply a voltage to the source line SL that matches the voltage on the bit line BL.

FIG. 21 is a flow chart describing one embodiment of a process for operating a memory system with the proposed source line driver 1002. In one embodiment, the memory system will have thousands or tens of thousands of source line drivers 1002, with one source line driver 1002 per source line. Thus, the process of FIG. 21 is performed concurrently on the thousands or tens of thousands of source line drivers 1002. In step 1102, one or more control circuits 1004 send control signals to the source line drivers indicating the mode of operation, to configure the source line drivers for programming, verifying or reading. In step 1104, one or more control circuits 1004 apply benchmark voltages to the multiple source line drivers 1002. In some embodiments the same benchmark voltage is applied to all source line drivers 1002, while in other embodiments different benchmark voltages are applied to different source line drivers 1002. One or more control circuits 1004 can adjust the reference used by detection circuit 1022 by changing the applied benchmark voltage. As will be discussed in more detail below, in one embodiment associated with a multi-pass programming process (see e.g., FIGS. 18 and 19), one or more control circuits 1004 apply the benchmark voltage at a first voltage level during a first pass of a multi-pass programming process and at a second voltage level during a second pass of the multi-pass programming process. More details are provided below.

In step 1106, the bit line drivers (e.g., sense amplifiers or sense blocks) generate bit line voltages. When performing programming or verify operations, the bit line voltages are generated as data dependent voltages based on the target data states for programming. For example, FIGS. 11, 18 and 19 describe data dependent bit line voltages. As part of the programming process, Controller 122 loads the data to be programmed into the data latches 494 (see FIG. 1E). The data in these latches is used to generate the data dependent bit line voltages. During a read process, the bit line drivers will generate a pre-charge voltage, which can be approximately 0.5 volts (or other pre-set voltage).

In step 1108, source line drivers 1002 apply voltages to different source lines based on the sensed information about the voltages on the different bit lines. If performing programming or verify, source line drivers 1002 apply different voltages to different source lines based on the sensed information about the different magnitudes of the voltages on the different bit lines. If performing a read operation, in one embodiment the same voltage will be applied to all source lines.

FIG. 22A is a flow chart describing one embodiment of a process for operating source line driver 1002 during verify or read operations. Thus, the process of FIG. 22A is one example of a more detailed implementation of step 1108 of FIG. 21. During verify or read operations, the one or more control circuits 1004 indicate to the detection circuit 1022 that verifying or reading is being performed via the control signals discussed above. In step 1150 of FIG. 22A, detection circuit 1022 (see FIG. 20) accesses/receives a voltage on the bit line BL. Additionally, detection circuit 1022 accesses/receives the benchmark voltage from the one or more control circuits 1004. The benchmark voltage is used by detection circuit 1022 to create a reference. In step 1152, detection circuit 1022 compares the bit line voltage to the reference and provides the output of that comparison to shorting circuit 1020 and voltage adjustment circuit 1024. In step 1154, shorting circuit 1020 generates/applies a voltage for the source line that matches the bit line voltage if the bit line voltage is greater than or equal to the reference. In step 1156, voltage adjustment circuit 1024 generates/applies a voltage for the source line that is offset by a predetermined amount from the bit line voltage if the bit line voltage is less than the reference. In some embodiments, the offset can be a variable amount.

FIG. 22B is a flow chart describing one embodiment of a process for operating source line driver 1002 during programming operations. Thus, the process of FIG. 22B is one example of a more detailed implementation of step 1108 of FIG. 21. During programming operations, the one or more control circuits 1004 indicate to the detection circuit 1022 that programming is being performed via the control signals discussed above. Detection circuit 1022 is configured to access/receive the bit line voltage in step 1180. Detection circuit 1022 causes shorting circuit 1020 to generate/apply a voltage to the source line that matches the voltage on the bit line in step 1182, in response to receiving the indication from the one or more control circuits 1004 that programming is being performed.
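The behavior of FIGS. 22A and 22B can be summarized in a small behavioral model. This is only a sketch of the input/output relationship, not of the circuit itself; the names `Mode` and `source_line_voltage` are hypothetical, the reference is passed in as a plain parameter, and the default offset is the 0.5 volt example mentioned above.

```python
from enum import Enum


class Mode(Enum):
    PROGRAM = "program"
    VERIFY_READ = "verify/read"


def source_line_voltage(mode, vbl, reference, offset=0.5):
    """Behavioral model of the source line voltage for one driver.

    PROGRAM: the source line matches the bit line (FIG. 22B).
    VERIFY_READ: the source line matches the bit line when the bit line
    is at or above the reference; otherwise it is the bit line voltage
    minus a predetermined offset (FIG. 22A).
    """
    if mode is Mode.PROGRAM:
        return vbl
    if vbl >= reference:
        return vbl          # shorting circuit path
    return vbl - offset     # voltage adjustment circuit path
```

Note that the model takes no target data state as input: the bit line voltage alone carries the data dependent information, which is why the driver needs no latches.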

FIG. 23 is a schematic diagram of one embodiment of a source line driver. For example, the electrical circuit of FIG. 23 is one example implementation of source line driver 1002 of FIG. 20. FIG. 23 includes a box around circuit 1302, which functions as a current and voltage supply circuit 1302. In one embodiment, current and voltage supply circuit 1302 is part of control circuits 1004. The components depicted in FIG. 23 that are not part of circuit 1302 comprise a single source line driver. The memory die 108 will include one such source line driver for every source line. Therefore, there will be many source line drivers. In one embodiment, memory die 108 will include only one current and voltage supply circuit 1302 that connects to all source line drivers. In other embodiments, memory die 108 will include a small number of current and voltage supply circuits 1302 (e.g., one per plane).

The source line driver of FIG. 23 includes NMOS transistors N1, N2, N3, N4, N5, N6, N7 and N8; and PMOS transistors P1, P2, P3, P4 and P5. FIG. 23 also shows signal lines 1310, 1312, 1314, 1316, 1318, 1320, 1322, 1324, 1326, 1328 and 1330 that carry signals to all (or multiple) source line drivers.

Current and voltage supply circuit 1302 comprises current source 1350 (2uA) connected to NMOS transistor N12, current source 1352 (1uA) connected to NMOS transistor N10, voltage supply 1356 (−0.7v) connected to NMOS transistor N10, current source 1354 (1uA) connected to NMOS transistor N11, and voltage supply 1358 (−0.2v) connected to NMOS transistor N11. Current and voltage supply circuit 1302 (or a different component of control circuits 1004) connects signal line 1314 to ground. Current and voltage supply circuit 1302 (or a different component of control circuits 1004) drives the benchmark voltage BMV (discussed above) on signal line 1316. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass when high data states are programmed BMV=3.25v-VTP and during the second pass when low data states are programmed BMV=2.45v-VTP. For full sequence programming, either the same voltage or a different voltage can be made to work. VTP is the threshold voltage of PMOS transistors P1, P2, P3, P4 and P5. In one embodiment, VTP=0.5v; therefore, the benchmark voltage is 2.75 volts for the first pass when high data states are programmed and the benchmark voltage is 1.95 volts for the second pass when low data states are programmed.

In one embodiment, PMOS transistors P1, P2, P3, P4 and P5 all have the same threshold voltage. In other embodiments, PMOS transistors P1, P2, P3, P4 and P5 have different threshold voltages and VTP is the threshold voltage of PMOS transistor P2. In one embodiment, NMOS transistors N1, N2, N3, N4, N5, N6, N7 and N8 all have the same threshold voltage, VTN. In other embodiments, NMOS transistors N1, N2, N3, N4, N5, N6, N7 and N8 have different threshold voltages. In one embodiment, VTP=0.5v and VTN=0.5v; however, other values can also be used and the circuit can be adjusted accordingly. Current and voltage supply circuit 1302 also includes switches 1360 and 1362.

Signal line 1318 is connected to NMOS transistors N12, N5 and N6. Signal line 1320 is connected to NMOS transistors N12 and N5, as well as switch 1360. Signal line 1322 is connected to NMOS transistor N6 and switch 1360. Signal line 1324 is connected to NMOS transistors N4 and N10, as well as current source 1352. Signal line 1326 is connected to NMOS transistors N4 and N10, as well as voltage supply 1356. Signal line 1328 is connected to NMOS transistors N3 and N11, as well as current source 1354. Signal line 1330 is connected to NMOS transistors N3 and N11, as well as voltage supply 1358.

Signal lines 1310 and 1312 can be driven by current and voltage supply circuit 1302 (or a different component of control circuits 1004) or another on-chip supply. The voltage applied to signal line 1310 is labeled as VA. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass when high data states are programmed VA=4.0 volts and during the second pass when low data states are programmed VA=3.2 volts. For full sequence programming, either the same voltage or a different voltage can be made to work. The voltage applied to signal line 1320 is labeled as VB. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass when high data states are programmed VB=3.5 volts and during the second pass when low data states are programmed VB=2.7 volts. For full sequence programming, either the same voltage or a different voltage can be made to work.

NMOS transistor N1 is connected to signal line 1310, the bit line (VBL), PMOS transistor P2, NMOS transistor N2 and NMOS transistor N7. NMOS transistor N2 is connected to signal line 1312, NMOS transistor N1, PMOS transistor P1, NMOS transistor N8 and PMOS transistor P5. PMOS transistor P1 is connected to the source line (VSL), NMOS transistor N2, PMOS transistor P5 and signal line 1314. PMOS transistor P2 is connected to the bit line (VBL), signal line 1316 providing the benchmark voltage BMV, PMOS transistor P3 and NMOS transistor N5. PMOS transistor P3 is connected to the bit line (VBL), PMOS transistor P2, NMOS transistor N6 and NMOS transistor N7. NMOS transistor N8 is connected to NMOS transistor N7, NMOS transistor N4, PMOS transistor P4 and PMOS transistor P5. NMOS transistor N7 is connected to NMOS transistor N1, NMOS transistor N3, NMOS transistor N6, NMOS transistor N8 and PMOS transistor P3.

NMOS transistors N3, N4, N5 and N6 operate as leakers that pull down the connected nodes when the pull up from those nodes is not strong. For example, if P2, P3, N7 and/or N8 are not strongly being pulled up, then the leakers will pull them down.

In one embodiment, detection circuit 1022 of FIG. 20 corresponds to transistors P2, P3, P5, N3, N4, N5, N6, N7 and N8; shorting circuit 1020 of FIG. 20 corresponds to transistor P4; and voltage adjustment circuit 1024 of FIG. 20 corresponds to transistors N1, N2 and P1. In the embodiment depicted in FIG. 23, therefore, the detection circuit, the shorting circuit and the voltage adjustment circuit consist of PMOS and NMOS transistors having a substantially similar magnitude of threshold voltage (e.g., VTP=VTN=0.5 volts). In other embodiments, the detection circuit, the shorting circuit and the voltage adjustment circuit comprise PMOS transistors having a different magnitude of threshold voltage than the NMOS transistors.

The source line driver of FIG. 23 can operate in two modes: (1) program mode and (2) verify/read mode. The source line driver changes between the two modes via switches 1360 and 1362, which are controlled by control circuits 1004 (e.g., state machine 112).

To operate in (1) program mode, switch 1360 is open and switch 1362 is closed, based on control signals from the control circuits 1004. The source of N5 will be driven at 4.1v. The gate of N5 will also be driven at 4.1v, providing 3.6v at the gate of P3. Additionally, 4.2 volts is driven on signal line 1316 to cause P2 to turn off and allow N5 to pass 4.2-0.5 (VTP)=3.7 volts, which will turn off P3. The source of N6 is driven to −0.7v. N6 passes the −0.7v to turn on P4 and P5. When on, P4 effectively shunts the bit line (VBL) to the source line (VSL); therefore, during programming, the source line driver is applying a voltage to the source line that matches the voltage sensed on the bit line. This behavior is in agreement with the tables of FIGS. 18 and 19, which show the voltage on the bit line being the same as the voltage on the source line during programming.

To operate in (2) verify/read mode, switch 1360 is closed and switch 1362 is open, based on control signals from the control circuits 1004. Therefore, signal lines 1320 and 1322 are driven to the same voltage causing the sources of N5 and N6 to be driven to the same voltage. In one embodiment, signal line 1322 is at 0.7v during programming and at ground during verify.

During verify, the bit line voltage VBL is compared against a reference by transistor P2. In the embodiment of FIG. 23, the reference is 3.25 volts for the first pass of the multi-pass programming when high data states are programmed and 2.45 volts for the second pass of the multi-pass programming when low data states are programmed. The reference has a mathematical relationship to the benchmark voltage. For example, in the embodiment of FIG. 23, the reference is equal to the sum of the benchmark voltage and VTP. Stated another way, the reference is 0.5 volts greater than the benchmark voltage. The control circuits 1004 can adjust the reference by applying the benchmark voltage to the detection circuit (e.g., to P2).
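The relationship between the benchmark voltage and the reference for the embodiment of FIG. 23 can be checked with a short calculation. The function names are hypothetical; the numeric values are those given for FIG. 23 (VTP=0.5 volts, references of 3.25 and 2.45 volts).

```python
def reference_voltage(benchmark_v, vtp):
    """Reference seen by the detection circuit: benchmark voltage plus VTP."""
    return benchmark_v + vtp


def benchmark_voltage(reference_v, vtp):
    """Benchmark voltage the control circuits apply to obtain a reference."""
    return reference_v - vtp


VTP = 0.5  # PMOS threshold voltage in the FIG. 23 embodiment

first_pass_bmv = benchmark_voltage(3.25, VTP)   # high data states
second_pass_bmv = benchmark_voltage(2.45, VTP)  # low data states
```

Under these values the first pass uses a benchmark voltage of 2.75 volts and the second pass uses 1.95 volts, matching the figures stated for FIG. 23.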

If the bit line voltage VBL is greater than or equal to the reference (3.25v during first pass or 2.45v during second pass), then P2 turns ON, which causes P3 to turn OFF. N7 and N8 will also turn OFF, P4 turns ON and P5 turns ON. Because P4 is ON, VSL is effectively shorted to VBL via P4. Having P5 ON prevents P1 from turning on, so that P1 (voltage adjustment circuit) does not contribute to the voltage of the source line. So, during verify, if VBL is greater than or equal to the reference, the source line driver is applying a voltage to the source line that matches the voltage sensed on the bit line.

If VBL is less than the reference (3.25v during first pass or 2.45v during second pass), then P2 is OFF, which causes P3 to turn ON. Because P3 is ON, VBL is provided at the gate of N7, which turns ON N7, which turns ON N8. Because N8 is ON, P4 turns OFF, so VSL is not shorted to VBL via P4.

When VBL is less than the reference, the drain of N1 is at VBL−VTN, the drain of N2 is at VBL−2*VTN and VSL is at VBL−2*VTN+VTP. Since, in this embodiment, VTN=VTP then the source line voltage VSL is one threshold voltage drop below the bit line voltage VBL. As, in this embodiment, VTN=VTP=0.5 volts, when VBL is less than the reference then the voltage adjustment circuit is configured to apply a voltage VSL to the source line that is offset from the voltage VBL by 0.5v (a predetermined constant that is also a first multiple of a threshold voltage of a transistor in the voltage adjustment circuit). The circuit can be modified to add additional NMOS transistors so that the offset between VBL and VSL is a different multiple of a threshold voltage of a transistor in the voltage adjustment circuit.
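The threshold-voltage arithmetic above can be written out as a sketch. The function name and the `nmos_drops` parameter are hypothetical. Treating each additional NMOS transistor as one more VTN drop is an assumption made here to illustrate the "different multiple" variant; the source does not give that formula explicitly.

```python
def adjusted_source_line_voltage(vbl, vtn=0.5, vtp=0.5, nmos_drops=2):
    """Source line voltage when VBL is below the reference.

    With nmos_drops=2 this is the stated relation VSL = VBL - 2*VTN + VTP.
    Other values of nmos_drops model the modified circuit with additional
    NMOS transistors, assuming (not from the source) one VTN drop each.
    """
    return vbl - nmos_drops * vtn + vtp


# Arbitrary example bit line voltage of 2.0 volts:
vsl_default = adjusted_source_line_voltage(2.0)               # one VT below VBL
vsl_extra = adjusted_source_line_voltage(2.0, nmos_drops=3)   # two VT below VBL
```

With VTN=VTP=0.5 volts, the default case lands 0.5 volts below the bit line, and the assumed three-drop variant lands 1.0 volt below it.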

The discussion of the operation of the source line driver of FIG. 23 during verify is in agreement with the tables of FIGS. 18 and 19 which show that when VBL is at higher voltages (e.g., 3.6v during first pass and 2.8v during second pass) the source line driver is applying a voltage to the source line that matches the voltage on the bit line. When VBL is at lower voltages (e.g., 2.9v or below during first pass and 2.1v or below during second pass) the source line driver is applying a voltage to the source line that is 0.5v below the voltage on the bit line.

When reading, one embodiment pre-charges the bit line voltage to 0.5 volts. In response, the source line driver will drive the source line at 0 volts (i.e., VSL=VBL−2*VTN+VTP=0.5−2(0.5)+0.5=0).

When locking out a memory cell, the one or more control circuits drive a high bit line voltage which causes the source line voltage to match the bit line voltage and no current will flow through the NAND string.

FIG. 24 is a schematic diagram of another embodiment of a source line driver. For example, the electrical circuit of FIG. 24 is another example implementation of source line driver 1002 of FIG. 20. FIG. 24 includes a box around circuit 1402, which functions as a current and voltage supply circuit 1402. In one embodiment, current and voltage supply circuit 1402 is part of control circuits 1004. The components depicted in FIG. 24 that are not part of circuit 1402 comprise a single source line driver. The memory die 108 will include one such source line driver for every source line. Therefore, there will be many source line drivers. In one embodiment, memory die 108 will include only one current and voltage supply circuit 1402 that connects to all source line drivers. In other embodiments, memory die 108 will include a small number of current and voltage supply circuits 1402 (e.g., one per plane).

The source line driver of FIG. 24 includes NMOS transistors N1, N3, N5, N6, and N7; and PMOS transistors P1, P2, P3, P4 and P5. One difference between the circuit of FIG. 23 and the circuit of FIG. 24 is that the PMOS transistors of FIG. 24 have a lower threshold voltage (VTP=0.2v) and the NMOS transistors have a higher threshold voltage (VTN=0.7v). FIG. 24 also shows signal lines 1310, 1312, 1314, 1316, 1318, 1320, 1430, 1432 and 1434 that carry signals to all (or multiple) source line drivers.

Current and voltage supply circuit 1402 comprises current source 1350 (2uA) connected to NMOS transistor N12, voltage supply 1356 (−0.7v) connected to switch 1452, current source 1354 (1uA) connected to NMOS transistor N11, and voltage supply 1358 (−0.2v) connected to NMOS transistor N11. Current and voltage supply circuit 1402 also includes switches 1450 and 1452. Current and voltage supply circuit 1402 (or a different component of control circuits 1004) connects signal line 1314 to ground. Current and voltage supply circuit 1402 (or a different component of control circuits 1004) drives the benchmark voltage BMV (discussed above) on signal line 1316. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass when high data states are programmed BMV=3.65v-VTP and during the second pass when low data states are programmed BMV=2.45v-VTP. For full sequence programming, either the same voltage or a different voltage can be made to work. In one embodiment, VTP=0.2v; therefore, the benchmark voltage is 3.45 volts for the first pass when high data states are programmed and the benchmark voltage is 2.25 volts for the second pass when low data states are programmed.

Signal line 1318 is connected to NMOS transistors N12, N5 and N6. Signal line 1320 is connected to NMOS transistors N12 and N5, as well as switch 1360. Signal line 1430 is connected to NMOS transistor N6 and switches 1450 and 1452. Signal line 1432 is connected to NMOS transistors N3 and N11, as well as current source 1354. Signal line 1434 is connected to NMOS transistors N3 and N11, as well as voltage supply 1358.

Signal lines 1310 and 1312 can be driven by current and voltage supply circuit 1402 (or a different component of control circuits 1004) or another on-chip supply. The voltage applied to signal line 1310 is labeled as VA. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass, when high data states are programmed, VA=4.0 volts and during the second pass, when low data states are programmed, VA=3.2 volts. For full sequence programming, either the same (or a different) voltage can be made to work. The voltage applied to signal line 1320 is labeled as VB. In embodiments that perform the HSF multi-pass programming discussed above, during the first pass, when high data states are programmed, VB=3.5 volts and during the second pass, when low data states are programmed, VB=2.7 volts. For full sequence programming, either the same (or a different) voltage can be made to work.

NMOS transistor N1 is connected to signal line 1310, the bit line (VBL), PMOS transistor P1, PMOS transistor P2, PMOS transistor P3, PMOS transistor P5 and NMOS transistor N7. PMOS transistor P1 is connected to the source line (VSL), NMOS transistor N1, NMOS transistor N7, PMOS transistor P5 and signal line 1314. PMOS transistor P2 is connected to the bit line (VBL), signal line 1316 providing the benchmark voltage BMV, PMOS transistor P3 and NMOS transistor N5. PMOS transistor P3 is connected to the bit line (VBL), PMOS transistor P2, NMOS transistor N6 and NMOS transistor N7. NMOS transistor N7 is connected to NMOS transistor N1, NMOS transistor N3, NMOS transistor N6, PMOS transistor P1, PMOS transistor P3 and PMOS transistor P5. NMOS transistors N3, N5 and N6 operate as leakers that pull down the connected nodes when the pull up from those nodes is not strong.

In one embodiment, detection circuit 1022 of FIG. 20 corresponds to transistors P2, P3, P5, N3, N5, N6, and N7; shorting circuit 1020 of FIG. 20 corresponds to transistor P4; and voltage adjustment circuit 1024 of FIG. 20 corresponds to transistors N1 and P1. In the embodiment depicted in FIG. 24, therefore, the detection circuit, the shorting circuit and the voltage adjustment circuit consist of PMOS transistors having a different magnitude of threshold voltage than the NMOS transistors.

The source line driver of FIG. 24 can operate in two modes: (1) program mode and (2) verify/read mode. The source line driver switches between the two modes via switches 1450 and 1452, which are controlled by control circuits 1004 (e.g., state machine 112).

To operate in (1) program mode, switch 1450 is open and switch 1452 is closed, based on control signals from the control circuits 1004. The source of N5 will be driven at 4.1v. The gate of N5 will also be driven at 4.1v, providing 3.6v at the gate of P3. Additionally, 4.2 volts is driven on signal line 1316 to cause P2 to turn off and allow N5 to pass 4.2-0.5 (VTP)=3.7 volts, which will turn off P3. The source of N6 is driven to −0.7v. N6 passes the −0.7v to turn on P4 and P5. When on, P4 effectively shunts the bit line (VBL) to the source line (VSL); therefore, during programming, the source line driver is applying a voltage to the source line that matches the voltage sensed on the bit line. This behavior is in agreement with the tables of FIGS. 18 and 19, which show the voltage on the bit line being the same as the voltage on the source line during programming.

To operate in (2) verify/read mode, switch 1450 is closed and switch 1452 is open, based on control signals from the control circuits 1004. Therefore, signal lines 1320 and 1430 are driven to the same voltage, causing the sources of N5 and N6 to be driven to the same voltage. In one embodiment, signal line 1430 is at −0.7v during programming and at ground during verify.
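The switch settings for the two modes can be summarized in a small lookup. The following Python sketch is illustrative only: the names `Mode`, `SwitchState` and `switch_settings` are hypothetical and do not appear in the figures, and the sketch models only the open/closed state of switches 1450 and 1452 as described above, not the analog behavior of the circuit.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Operating modes of the source line driver (hypothetical names)."""
    PROGRAM = "program"
    VERIFY_READ = "verify/read"


@dataclass(frozen=True)
class SwitchState:
    """Open/closed state of switches 1450 and 1452 (True means closed)."""
    sw1450_closed: bool
    sw1452_closed: bool


def switch_settings(mode: Mode) -> SwitchState:
    """Return the switch state that selects the given mode."""
    if mode is Mode.PROGRAM:
        # Program mode: switch 1450 open, switch 1452 closed.
        return SwitchState(sw1450_closed=False, sw1452_closed=True)
    # Verify/read mode: switch 1450 closed, switch 1452 open.
    return SwitchState(sw1450_closed=True, sw1452_closed=False)


for mode in Mode:
    print(mode.value, switch_settings(mode))
```

In this sketch the two switches are always in opposite states, which reflects the two modes described; it is not a statement that other combinations are impossible in hardware.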

During verify, the bit line voltage VBL is compared against a reference by transistor P2. In the embodiment of FIG. 24, the reference is 3.65 volts for the first pass of the multi-pass programming when high data states are programmed and 2.45 volts for the second pass of the multi-pass programming when low data states are programmed. The reference has a mathematical relationship to the benchmark voltage. For example, in the embodiment of FIG. 24, the reference is equal to the sum of the benchmark voltage and VTP. Alternatively said, the reference is 0.2 volts greater than the benchmark voltage. The control circuits 1004 can adjust the reference by applying the benchmark voltage to the detection circuit (e.g., to P2).
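The relationship between the reference and the benchmark voltage can be checked with simple arithmetic. The Python sketch below is a minimal illustration, assuming the stated relationship (reference equals benchmark voltage plus VTP) and the VTP value of 0.2 volts given for this embodiment; the names `REFERENCE_BY_PASS` and `benchmark_voltage` are hypothetical.

```python
VTP = 0.2  # PMOS threshold voltage magnitude in volts (FIG. 24 embodiment)

# References stated for the two passes of the multi-pass programming.
REFERENCE_BY_PASS = {"first": 3.65, "second": 2.45}


def benchmark_voltage(reference: float, vtp: float = VTP) -> float:
    """Return the benchmark voltage BMV such that reference = BMV + VTP."""
    return reference - vtp


for pass_name, reference in REFERENCE_BY_PASS.items():
    bmv = benchmark_voltage(reference)
    print(f"{pass_name} pass: reference={reference:.2f} V, BMV={bmv:.2f} V")
```

With these values the sketch gives a benchmark voltage of 3.45 volts for the first pass and 2.25 volts for the second pass.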

If the bit line voltage VBL is greater than or equal to the reference (3.65v during first pass or 2.45v during second pass), then P2 turns ON, which causes P3 to turn OFF. N7 will also turn OFF, P4 turns ON and P5 turns ON. Because P4 is ON, VSL is effectively shorted to VBL via P4. Having P5 ON prevents P1 from turning on so that P1 (voltage adjustment circuit) does not contribute to the voltage of the source line. So, during verify, if VBL is greater than the reference, the source line driver is applying a voltage to the source line that matches the voltage sensed on the bit line.

If VBL is less than the reference (3.65v during first pass or 2.45v during second pass), then P2 is OFF, which causes P3 to turn ON. Because P3 is on, VBL is provided at the gate of N7, which turns ON N7. Because N7 is ON, P4 turns OFF, so VSL is not shorted to VBL via P4.

When VBL is less than the reference, the drain of N1 is at VBL−VTN and VSL is at VBL−VTN+VTP. Since VTN≠VTP in this embodiment, the source line voltage VSL is offset from the voltage on the bit line by an amount equal to a difference in threshold voltages of transistors N1 and P1 (both in the voltage adjustment circuit). Thus, when VBL is less than the reference, the voltage adjustment circuit is configured to apply a voltage VSL to the source line that is offset from the voltage VBL by a predetermined constant that is an amount equal to a difference in threshold voltages of transistors N1 and P1.
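The verify behavior described above can be expressed as an idealized input/output model. The Python sketch below is illustrative only and is not a circuit simulation: it assumes ideal threshold drops with VTN=0.7v and VTP=0.2v as stated for this embodiment, the function name `verify_source_line_voltage` is hypothetical, and the bit line voltages in the loop are example inputs.

```python
VTN = 0.7  # NMOS threshold voltage magnitude in volts (FIG. 24 embodiment)
VTP = 0.2  # PMOS threshold voltage magnitude in volts (FIG. 24 embodiment)


def verify_source_line_voltage(vbl: float, reference: float,
                               vtn: float = VTN, vtp: float = VTP) -> float:
    """Idealized source line voltage VSL during verify for a given VBL."""
    if vbl >= reference:
        # P4 is ON, so the source line is effectively shorted to the bit line.
        return vbl
    # P4 is OFF; N1 and P1 set VSL = VBL - VTN + VTP.
    return vbl - vtn + vtp


# Second pass (reference of 2.45 V) with two example bit line voltages.
for vbl in (2.8, 2.1):
    vsl = verify_source_line_voltage(vbl, reference=2.45)
    print(f"VBL={vbl:.1f} V -> VSL={vsl:.1f} V")
```

With the threshold values of this embodiment, the offset applied below the reference is VTN−VTP=0.5 volts, so an example bit line voltage of 2.1 volts yields a source line voltage of 1.6 volts, while 2.8 volts is passed through unchanged.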

The discussion of the operation of the source line driver of FIG. 24 during verify is in agreement with the tables of FIGS. 18 and 19 which show that when VBL is at higher voltages (e.g., 3.6v during first pass and 2.8v during second pass) the source line driver is applying a voltage to the source line that matches the voltage on the bit line. When VBL is at lower voltages (e.g., 2.9v or below during first pass and 2.1v or below during second pass) the source line driver is applying a voltage to the source line that is 0.5v below the voltage on the bit line.

When reading, one embodiment pre-charges the bit line voltage to 0.5 volts. In response, the source line driver will drive the source line at 0 volts (i.e. SL=VBL−2VTN+VTP=0.5-2(0.5)+0.5=0).

When locking out a memory cell, the one or more control circuits drive a high bit line voltage which causes the source line voltage to match the bit line voltage and no current will flow through the NAND string.

One embodiment includes an apparatus, comprising: a detection circuit connected to a bit line of a non-volatile memory array, the detection circuit configured to compare a voltage on the bit line to a reference; a shorting circuit connected to the bit line, a source line of the non-volatile memory array, and the detection circuit, the shorting circuit configured to apply a voltage to the source line that matches the voltage on the bit line if the voltage on the bit line is greater than the reference; and a voltage adjustment circuit connected to the bit line, the source line, and the detection circuit, the voltage adjustment circuit configured to apply a voltage to the source line that is offset by a predetermined constant amount from the voltage on the bit line if the voltage on the bit line is less than the reference.

One embodiment comprises an apparatus, comprising: a plurality of non-volatile memory cells configured to form a monolithic three dimensional memory structure; a plurality of bit lines connected to the memory cells; a plurality of source lines connected to the memory cells; a plurality of bit line drivers connected to the bit lines; and a plurality of source line drivers connected to the source lines and the bit lines, the source line drivers configured to apply voltages to the source lines based on bit line voltages.

One embodiment includes a method comprising: receiving a bit line voltage on a bit line connected to a source line driver, the source line driver is also connected to a source line; if programming, applying a voltage to the source line that matches the bit line voltage; and if verifying, then applying a voltage to the source line that matches the bit line voltage if the bit line voltage is greater than a reference and applying a voltage to the source line that is offset from the bit line voltage if the bit line voltage is less than the reference.

One embodiment includes an apparatus, comprising: a plurality of non-volatile memory cells forming a monolithic three dimensional memory array; multiple separate and isolated source lines connected to the memory cells; multiple separate and isolated bit lines connected to the memory cells, each bit line is paired with a different source line to form source line/bit line pairs; and means for driving the source lines based on bit line voltages during verify, the means for driving the source lines are connected to the bit lines and the source lines.

For purposes of this document, reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “another embodiment” may be used to describe different embodiments or the same embodiment.

For purposes of this document, a connection may be a direct connection or an indirect connection (e.g., via one or more other parts). In some cases, when an element is referred to as being connected or coupled to another element, the element may be directly connected to the other element or indirectly connected to the other element via intervening elements. When an element is referred to as being directly connected to another element, then there are no intervening elements between the element and the other element. Two devices are “in communication” if they are directly or indirectly connected so that they can communicate electronic signals between them.

For purposes of this document, the term “based on” may be read as “based at least in part on.”

For purposes of this document, without additional context, use of numerical terms such as a “first” object, a “second” object, and a “third” object may not imply an ordering of objects, but may instead be used for identification purposes to identify different objects.

For purposes of this document, the term “set” of objects may refer to a “set” of one or more of the objects.

The foregoing detailed description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto. 

What is claimed is:
 1. An apparatus, comprising: a detection circuit connected to a bit line of a non-volatile memory array, the detection circuit configured to compare a voltage on the bit line to a reference; a shorting circuit connected to the bit line, a source line of the non-volatile memory array, and the detection circuit, the shorting circuit configured to apply a voltage to the source line that matches the voltage on the bit line if the voltage on the bit line is greater than the reference; and a voltage adjustment circuit connected to the bit line, the source line, and the detection circuit, the voltage adjustment circuit configured to apply a voltage to the source line that is offset by a predetermined constant amount from the voltage on the bit line if the voltage on the bit line is less than the reference.
 2. The apparatus of claim 1, further comprising a plurality of non-volatile memory cells; and one or more control circuits connected to the detection circuit and the memory cells, the one or more control circuits configured to adjust the reference used by the detection circuit by applying a benchmark voltage to the detection circuit, the reference has a mathematical relationship to the benchmark voltage.
 3. The apparatus of claim 2, further comprising: the one or more control circuits connected to the detection circuit and configured to indicate to the detection circuit whether programming is being performed, the detection circuit further configured to cause the shorting circuit to apply a voltage to the source line that matches the voltage on the bit line in response to receiving an indication from the one or more control circuits that programming is being performed.
 4. The apparatus of claim 1, further comprising a plurality of non-volatile memory cells; and one or more control circuits connected to the detection circuit and the memory cells, the one or more control circuits configured to adjust the reference used by the detection circuit by applying a first voltage to the detection circuit during a first pass of a multi-pass programming process and a second voltage during a second pass of the multi-pass programming process.
 5. The apparatus of claim 1, wherein: the detection circuit, the shorting circuit and the voltage adjustment circuit consist of PMOS and NMOS transistors having a substantially similar magnitude of threshold voltage.
 6. The apparatus of claim 1, wherein: the detection circuit, the shorting circuit and the voltage adjustment circuit comprise PMOS transistors and NMOS transistors, the PMOS transistors having a different magnitude of threshold voltage than the NMOS transistors.
 7. The apparatus of claim 1, wherein: the voltage adjustment circuit is configured to selectively apply a voltage to the source line that is offset from the voltage on the bit line by a multiple of a threshold voltage of a transistor in the voltage adjustment circuit.
 8. The apparatus of claim 1, wherein: the voltage adjustment circuit is configured to selectively apply a voltage to the source line that is offset from the voltage on the bit line by an amount equal to a difference in threshold voltages of transistors in the voltage adjustment circuit.
 9. The apparatus of claim 1, further comprising: a plurality of non-volatile memory cells; and one or more control circuits connected to the detection circuit and the memory cells, the one or more control circuits are configured to apply a voltage on the bit line based on a target programming voltage level.
 10. An apparatus, comprising: a plurality of non-volatile memory cells configured to form a monolithic three dimensional memory structure; a plurality of bit lines connected to the memory cells; a plurality of source lines connected to the memory cells; a plurality of bit line drivers connected to the bit lines; and a plurality of source line drivers connected to the source lines and the bit lines, the source line drivers configured to apply voltages to the source lines based on bit line voltages.
 11. The apparatus of claim 10, wherein: each bit line is paired with a different source line to form source line/bit line pairs; and each source line driver comprises: a detection circuit connected to a bit line of a source line/bit line pair, a shorting circuit connected to the bit line, a source line of the source line/bit line pair and the detection circuit, and a voltage adjustment circuit connected to the bit line and the source line of the source line/bit line pair and the detection circuit.
 12. The apparatus of claim 10, wherein: the bit lines are positioned below the memory structure; the source lines are positioned above the memory structure; the bit line drivers are positioned below the memory structure; and the source line drivers are positioned to the side of the memory structure.
 13. The apparatus of claim 10, wherein: the memory cells form NAND strings in the monolithic three dimensional memory structure; each bit line is paired with a different source line to form source line/bit line pairs; and each source line/bit line pair is connected to a same set of NAND strings.
 14. The apparatus of claim 10, further comprising storage elements, each storage element accessible to one of the bit line drivers and the source line drivers for a paired bit line and source line, each storage element configured to store a programming target and wherein: the bit line drivers drive data dependent voltages on the bit lines during programming and verify, the data dependent voltages based on the programming targets.
 15. The apparatus of claim 10, wherein each source line driver comprises: a detection circuit connected to a bit line of a source line/bit line pair, the detection circuit is configured to detect whether a voltage on the bit line is greater than a reference; a shorting circuit connected to the bit line and a source line of the source line/bit line pair and the detection circuit, the shorting circuit is configured to apply a voltage to the source line that matches the voltage on the bit line if the voltage on the bit line is greater than the reference; and a voltage adjustment circuit connected to the bit line and the source line of the source line/bit line pair and the detection circuit, the voltage adjustment circuit is configured to apply a voltage to the source line that is offset by a predetermined constant amount from the voltage on the bit line if the voltage on the bit line is not greater than the reference.
 16. A method comprising: receiving a bit line voltage on a bit line connected to a source line driver, the source line driver is connected to a source line; if programming, applying a voltage to the source line that matches the bit line voltage; and if verifying, then applying a voltage to the source line that matches the bit line voltage if the bit line voltage is greater than a reference and applying a voltage to the source line that is offset from the bit line voltage if the bit line voltage is less than the reference.
 17. The method of claim 16, further comprising: generating the bit line voltage as a data dependent voltage based on a programming target for a non-volatile memory cell connected to the bit line and the source line.
 18. The method of claim 16, further comprising: adjusting the reference based on a type of memory operation being performed by sending different benchmark voltages to the source line driver for different memory operations.
 19. The method of claim 16, further comprising: sending a control signal to the source line driver to configure the source line driver for programming, verifying or reading.
 20. An apparatus, comprising: a plurality of non-volatile memory cells forming a monolithic three dimensional memory array; multiple separate and isolated source lines connected to the memory cells; multiple separate and isolated bit lines connected to the memory cells, each bit line is paired with a different source line to form source line/bit line pairs; and means for driving the source lines based on bit line voltages during verify, the means for driving the source lines are connected to the bit lines and the source lines. 